3 experiments are reported in which Ss were asked to judge the degree of
contingency between responses and outcomes. They were exposed to 60
trials on which a choice between 2 responses was followed by 1 of 2 possi-
ble outcomes. Each S judged both contingent and noncontingent problems.
Some Ss actually made response choices while others simply viewed the
events. Judgments were made by Ss who attempted to produce a single
favorable outcome or, on the other hand, to control the occurrence of two 
neutral outcomes. In all conditions the amount of contingency judged was 
correlated with the number of successful trials, but was entirely unrelated 
to the actual degree of contingency. Accuracy of judgment was not im- 
proved by pretraining Ss on selected examples, even though it was possible 
to remove the correlation between judgment and successes by means of an
appropriate selection of pretraining problems. The relation between every- 
day judgments of causal relations and the present experiment is considered. 


Whole No. 594, 1965 


An important part of human verbal
knowledge about the everyday phys-
ical and social environment is knowledge
about what causes what. No doubt much 
of that knowledge is acquired from others 
and entails an understanding, at various 
levels of detail, of how the relation between 
a particular cause and its effect is mediated. 
We know about the relation between the 
setting of a thermostat and the temperature 
of a house, not as a result of raw observa- 
tion, but through our understanding of the 
relation of thermostat to furnace and of 
furnace to heat. On the other hand, some 
knowledge about cause and effect sequences, 
whether valid or not, must arise primarily 
from the individual's experience with the 
way things happen. One may come to be- 
lieve that wet weather is the cause of vari- 
ous bodily ills even though one has little 
prior notion of how such a relation might 
be mediated. 

How are causes identified from experi- 
ence? There is no difficulty in identifying 
a cause when consequent follows antecedent 
quickly and regularly. The relation between 




1 The experiments were carried out while the
authors were members of staff of the Bell Tele-
phone Laboratories, Murray Hill, New Jersey.
The support of this research by the laboratories is
gratefully acknowledged.


the movements of a steering wheel and the 
behavior of a car, or between the flick of a
switch and the appearance of a light are 
quickly perceived. But causes are also iden- 
tified on the basis of less determinate ob- 
servations. Thus, one may decide that a re- 
mark made yesterday caused someone to 
change his behavior today, or that taking 
a drug produced recovery from an illness. 
It is clearly more difficult to correctly 
identify a causal relation in cases of this 
type. The increased difficulty arises, at least 
in part, from the fact that the outcome oc- 
curs with some frequency in the absence of 
the antecedent in question (e.g., recoveries 
sometimes occur without drugs); and the 
antecedents are sometimes present when the 
outcomes are not (taking a drug is not 
always followed by recovery). 

In the simple cases where the perception 
of a relation is immediate, the joint occur- 
rence of two events stands out against a 
background of experience in which neither 
event has appeared alone with appreciable 
frequency. A single joint occurrence may, in 
such cases, lead to the conviction that the 
events are causally related. In the less 
determinate or noisier cases, however, the 
joint occurrence of antecedent and conse- 
quent does not have the same force. If ante- 
cedent and consequent each occur without 
the other, their joint occurrence can arise
through chance as well as through the re- 
sult of causal relation. Thus the problem 
becomes one of estimating whether the 
frequency of joint occurrence exceeds what 
might be expected by chance. The estimate 
must rest upon a series of observations. 
When dealing with imperfectly related 
events of this sort, it seems more appropri- 
ate to speak of the judgment of a causal 
relation rather than of its perception. 

The present experiments were designed 
to yield some preliminary information on 
how accurately people judge the degree of 
relation between events when the actual de- 
pendency is varied from zero (independent 
events) to some intermediate degree well 
short of a determinate or completely de-
pendent relation. They were also concerned
with the basis of the judgments. The situa- 
tion, in brief, was this. The subjects were 
given two response buttons with which they 
tried to influence the appearance of two 
outcomes. On each of a series of trials they 
chose to press one of the response buttons 
and were then shown the outcome which 
followed. At the end of the series of trials 
they were asked to judge the degree of 
control which their response choices had 
exerted over the outcomes. We used the 
term “control” rather than alternative 
terms such as “dependency” or “correla-
tion” because in the context of the task it
seemed to be the most natural way to com-
municate the technical meaning of con-
tingency with everyday language.

It will be useful to have an index of the 
actual degree of control or contingency be- 
tween response choices and outcomes. The 
basic meaning of control is that the out- 
come depends upon the response. More ex- 
actly, there is control when the probability 
of a particular outcome given one response 
is different from the probability of that out- 
come given another response. The magni- 
tude of the difference in these conditional 
outcome probabilities provides a simple in-
dex of the amount of control. For the pres-
ent case with response alternatives $R_1$ and
$R_2$ and outcomes $O_1$ and $O_2$, the index of
contingency, $\Delta P$, is given by:
$|\Pr(O_1/R_1) - \Pr(O_1/R_2)|$. The expression
$\Pr(O_1/R_1)$ is read, the probability of $O_1$
given $R_1$. The range of values of $\Delta P$ is
from one (complete control) to zero (no control).
It is zero when the probability of $O_1$ given $R_1$
is the same as the probability of $O_1$ given
$R_2$. It is one when $O_1$ always follows given
$R_1$ and never follows given $R_2$, or vice
versa. The value of the index is unchanged
if the conditional probabilities for $O_2$ are
used in place of those for $O_1$ since
$P(O_2) = 1 - P(O_1)$.

If the four possible response-outcome
pairs are arranged in a double entry (2 X
2) table with cells labeled $a = R_1, O_1$;
$b = R_1, O_2$; $c = R_2, O_1$; $d = R_2, O_2$, the
$\Delta P$ index is given by

$$\left| \frac{a}{a+b} - \frac{c}{c+d} \right|,$$

which simplifies to

$$\left| \frac{ad - bc}{(a+b)(c+d)} \right|.$$
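
As a hedged illustration of the index of contingency just defined, the following Python sketch computes it from the four cell frequencies in both of the forms given above. The function names `delta_p` and `delta_p_products`, and the tallies in the demonstration, are hypothetical choices made for this example and do not come from the original report.

```python
from fractions import Fraction


def delta_p(a, b, c, d):
    """Return |a/(a+b) - c/(c+d)| for a 2 X 2 table of frequencies.

    a = R1 with O1, b = R1 with O2, c = R2 with O1, d = R2 with O2.
    Both responses must have been used at least once.
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each response must occur at least once")
    return abs(Fraction(a, a + b) - Fraction(c, c + d))


def delta_p_products(a, b, c, d):
    """Same index via the simplified form |ad - bc| / ((a+b)(c+d))."""
    if a + b == 0 or c + d == 0:
        raise ValueError("each response must occur at least once")
    return abs(Fraction(a * d - b * c, (a + b) * (c + d)))


if __name__ == "__main__":
    # Hypothetical 60-trial tallies, chosen only for illustration.
    print(delta_p(24, 6, 15, 15))           # intermediate contingency
    print(delta_p(30, 0, 0, 30))            # complete control
    print(delta_p(24, 6, 24, 6))            # no control
    print(delta_p_products(24, 6, 15, 15))  # agrees with the first form
```

Exact fractions are used so that the two forms can be compared without rounding error.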


Two experiments, one by Inhelder and 
Piaget (1958) and one by Smedslund 
(1963), are directly relevant to the present 
problem. 

Inhelder and Piaget examined the con- 
cept of correlation in children of about 10- 
15 years of age. The children were shown 
a number of cards each with a face drawn 
on it. The faces had blue or brown eyes and 
blonde or brown hair. Each subject was 
asked about the relation of eye color to hair 
color for each of several different sets of 
cards. If the four possible pairings of eye 
color with hair color are arranged in a 2 X 2 
table with cells labeled as follows: a =
blue eyes and blonde hair, b = blue eyes
and brown hair, c = brown eyes and blonde
hair, d = brown eyes and brown hair, then
the a and d cases, which make up one di-
agonal, are considered to be the confirming
cases, while the b and c cases on the other
diagonal are the nonconfirming cases. The
child was said to be using an explicit no-
tion of correlation if his answers were based
on the difference between the number of
confirming and the number of nonconfirm-
ing cases in the set.




2 In the absence of a specific hypothesis, the
confirming cases are considered to be those on the 
diagonal having the larger total, whether a + d or 
b+c. 

Two stages in the child's approach are
distinguished. In the first the child may
organize the pictures into the four pairings;
may talk about the chances of having, for
example, blonde hair if you have blue eyes;
and may identify confirming and noncon-
firming cases. However, two features of the
concept of correlation are missing. The first
is that the a and d cases are not seen as
equivalent and combined into one total, nor
are the b and c cases taken together. Some
children who do combine the cases properly,
however, run into a second difficulty when
they fail to relate the a + d cases to the
b + c cases. In the more advanced stage
of thinking these difficulties are overcome,
and the child spontaneously relates con-
firming cases to nonconfirming cases and
judges correlation in terms of the balance 
between the two. Concerning the proportion 
of children reaching this stage, the authors 
state only, 


It is usually toward 14-15 years that the frequency 
of these cases is high enough to define a stage. 


These results provide grounds, although 
not strong grounds, for expecting that adults 
are capable of making appropriate judg- 
ments of contingency in the present experi-
ment. The grounds are weak on two counts.
First, the data were displayed in quite a 
different manner. The instances upon which 
the judgment was based were small in num- 
ber, they were all in view at one time, and 
they could be arranged by the subject into 
groups corresponding to the four types of 
pairings. In the present experiment, on the
other hand, the instances are produced by
the subject over an extended series of trials.
A second, and more basic difference is that
the logic of the concept of contingency as
formulated by Inhelder and Piaget is less
generally applicable than is the logic en-
tailed by the $\Delta P$ index. The difference be-
tween the sum of the confirming cases and
the sum of the nonconfirming cases can 
serve as an index of contingency only if the 
two states of at least one of the variables 
appear equally often. Otherwise, the sums 
may differ even though the variables are 
independent. For example, consider a set 
of instances in which eye color and hair
color are in fact independent, but blue eyes
predominate over brown eyes, and blonde
hair predominates over brown hair. For a
particular example, let a = 8, b = 2, c = 4,
d = 1, where, as before, the letters stand
for frequencies of the joint occurrence of
each of the four possible pairings. Here,
there are more confirming cases (a + d)
than nonconfirming cases (b + c). The
$\Delta P$ index, however, is zero since the proba-
bility of having blonde hair given blue eyes,
8/10, is not different from the probability of
having blonde hair given brown eyes, or
4/5. (The difference in the formulations can
also be appreciated by noting that the
numerator of the expression for $\Delta P$ in terms
of cell frequencies is the difference between
the products of the cell frequencies on the
diagonals rather than a difference in their
sums.)
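
The numerical example above can be checked directly. The sketch below, written in Python purely for illustration (the function names are assumptions of this example, not the authors'), contrasts the difference of diagonal sums with the difference of conditional proportions for a = 8, b = 2, c = 4, d = 1.

```python
from fractions import Fraction


def diagonal_sum_difference(a, b, c, d):
    """Confirming minus nonconfirming cases: (a + d) - (b + c)."""
    return (a + d) - (b + c)


def conditional_difference(a, b, c, d):
    """|a/(a+b) - c/(c+d)|, the difference in conditional proportions."""
    return abs(Fraction(a, a + b) - Fraction(c, c + d))


if __name__ == "__main__":
    a, b, c, d = 8, 2, 4, 1
    # Nine confirming cases against six nonconfirming cases.
    print(diagonal_sum_difference(a, b, c, d))
    # 8/10 and 4/5 are the same proportion, so the index vanishes.
    print(conditional_difference(a, b, c, d))
    # The numerator of the index is a difference of diagonal products.
    print(a * d - b * c)
```

The sum-based measure reports an excess of confirming cases, while the proportion-based index reports independence, which is the point of the example.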

One cannot tell whether the successful 
subjects judged the correlation in terms of 
proportions or frequencies. In a number of 
the protocols given by Inhelder and Piaget 
the children do talk about “chances” or 
proportions rather than raw frequencies, 
and the authors make the general point 
that the concept of probability develops 
before that of correlation. However, in 
none of the cases reported were subjects pre- 
sented with sets of instances containing 
disproportionate frequencies in the states of 
both variables. 

The format of Smedslund’s experiment 
was more similar to that of the present 
experiment. The subjects, who were nurses, 
attempted to judge the connection between 
a symptom and a diagnosis. On each of a 
series of cards a set of letters representing 
symptoms appeared together with another 
set of letters representing diagnoses. The 
attention of the subjects was directed toward 
whether or not a connection existed between 
one particular symptom and one particular 
diagnosis. These data can also be cast 
into a 2 X 2 table. The cells of the table 
contain the frequencies of the four possible 
pairings: presence or absence of the symp- 
tom with presence or absence of the diag- 
nosis. 

The judgments obtained by Smedslund 
(1963) showed no relation to the actual 
contingency between symptom and diag- 
nosis. There was a substantial correlation 
between just the frequency with which
symptom and diagnosis appeared together 
(positive confirming cases only) and the 
number of subjects who thought that symp- 
tom and diagnosis were related. Smedslund 
concluded, 


normal adults with no training in statistics do
not have a cognitive structure isomorphic with 
correlation 


The effects of noncontingent reinforce-
ment on performance in learning tasks have
been studied. Although it is not clear ex- 
actly how to relate performance under non- 
contingent reinforcement to a judgment of 
contingency made after the performance, 
the experiments are suggestive. Wright 
(1962) used noncontingent schedules of re- 
ward in the setting of a trial-and-error 
problem. With higher frequencies of re-
ward, response patterns were more orderly
than they were at intermediate frequencies 
of reward. Bruner and Revusky (1961) pro- 
vided the subjects with several telegraph 
keys. Pressing one of the keys resulted in 
reinforcement at certain times, but the 
other keys were nonfunctional. The non- 
functional keys, however, were pressed in 
systematic patterns during the intervals be- 
tween reinforcements. When questioned, the 
subjects reported their belief that the en- 
tire response pattern was required to pro- 
duce the reward. In an unpublished experi- 
ment by the senior author, noncontingent 
reward was used in the setting of a concept 
formation task. The subjects were shown 
two-digit numbers and asked to respond 
with a third number. The experimenter pro- 
nounced their answer “correct” or “incor- 
rect” on each trial according to noncontin- 
gent random schedules. The subjects formed 
rules which typically entailed the use of 
several different arithmetical operations for 
different types of digit pairs. There were 
indications that rules were held with 
greater confidence when the fraction of 
trials "correct" was higher. Hake and Hy- 
man (1953) had subjects observe a random 
series of binary digits and try to predict 
each succeeding digit. They concluded that 
the subjects responded 


as though the series were composed of small sub- 
sequences some of which are dependable cues to 
the future behavior of the series. 


If the subjects had been asked about con-
tingency, they might have said that what
is about to appear depends upon what has
just appeared, although, of course, there is 
no such dependency in a random series. 

The general impression which is conveyed 
by the results of learning experiments with 
noncontingent outcomes is that the subjects 
are surprisingly insensitive to the distinc-
tion between contingent and noncontingent
arrangements. They tend to behave as 
though outcomes depend on responses, or 
as though one symbol can be predicted from 
another, when the events are in fact inde- 
pendent. Further, it is possible to read into 
some of these experiments the notion that 
higher frequencies of reward, or of correct 
prediction, encourage a belief in contin- 
gency. 

Although previous work provides little 
basis for a prediction of how well the con- 
tingency between responses and outcomes 
might be evaluated by the subjects in the 
present experiment, it does suggest some
factors which would be expected to produce
distortions of judgment.

It appears that confirming cases are 
given considerable weight in the judgment, 
while the role of nonconfirming cases is less 
clear. However, even if nonconfirming as 
well as confirming cases were taken into ac- 
count, it is not difficult to see that under 
certain conditions, a subject might respond 
in a way which would generate an excess of 
confirming over nonconfirming cases even 
though responses and outcomes were, in 
fact, entirely unrelated. Suppose that one 
of the outcomes is preferred over the others, 
e.g., it represents a score point, and that it 
is programed to appear frequently and in- 
dependently of responses. The response 
choices made by the subject at the outset 
will thus be accompanied by frequent scor- 
ing. If scores reinforce, the response chosen 
at the outset is likely to be maintained to 
the virtual exclusion of other alternatives. 
The predominance of one response (or pat- 
tern of responses) together with one out- 
come will yield an excess of confirming 
over nonconfirming cases which, in turn, 
might lead to a spurious belief in control. 
The situation is analogous to the previous 
example in which the predominance of
blue eyes and blonde hair gave an excess of
confirming cases even though eye and hair
color were independent. The confirming
cases are, of course, concentrated in a single
cell of a contingency table and the $\Delta P$ in-
dex remains at 0.

On the other hand, an excess of confirm-
ing cases cannot arise in the noncontingent
case, no matter how strongly a particular
outcome predominates, if response alterna-
tives are used with equal frequencies.
Therefore, a change in the character of the
subject's task which would lead him to a
more balanced use of response alternatives
should reduce the presumed tendency for
belief in control to increase with an in-
creasing predominance of one outcome. The
discrimination between contingent and non-
contingent cases should improve.

This conjecture was examined in the first
experiment. The same set of problems was
worked by subjects whose objectives while
working the problems were set by one of two
contrasting instructions: the instruction to
score, or to control. Under the score instruc-
tion, one of the two outcomes constituted a
“score,” the other a “no score,” and the sub-
ject was instructed, in part, to score as often
as possible. Under the control instruction, the
two outcomes were neutral symbols; and the
subject was instructed to learn how to pro-
duce each of them at will on any trial. The
control instruction was expected to produce
a more balanced use of response alternatives
and, as a result, more valid judgments of
contingency.

An ancillary purpose of the first experi-
ment was to find out if active involvement
as a performer in a learning task adversely
affects the validity of judgment of control.
Paired with each subject who made the re-
sponses (active subject) was one who sim-
ply watched a display of the responses and
outcomes (spectator). Both subjects judged
control at the end of the series of trials.




EXPERIMENT I


Method 


Subjects. The subjects were 50 college gradu-
ates, males and females, employed at the White
Plains, New York, office of the Long Lines Di-
vision, American Telephone and Telegraph Com-
pany. Their ages ranged from 21 to 58 years with
a median of 38.


Apparatus. The active subject was seated in
front of a control box and a display panel. For
subjects under the score instruction, two buttons
labeled “$R_1$” and “$R_2$”, a button labeled “clear”,
and one labeled “test” were available on the con-
trol box. On each trial the active subject made a
response choice, pressing either $R_1$ or $R_2$.
The choice was registered immediately on the dis-
play by the illumination of the numeral
1 or 2. The indication remained until the end of
the trial. If for any reason the subject wished to
change his response choice at this point, he could
do so by pressing the clear button and making a
new response choice. He then pressed the test
button which was followed immediately by either
the “score” outcome ($O_1$) or the “no score” out-
come.


Neutral symbols were used to represent the out-
comes: $O_1$ was shown by a lighted circle, and $O_2$
was shown by a lighted square. Two additional
buttons, referred to as “call” buttons, were made
available on the control box. The call buttons
were labeled with a square or a circle to cor-
respond to the outcomes. Under the control in-
struction, the subject indicated, by pressing one
of the call buttons at the beginning of each trial,
which outcome he was trying to produce on that
trial. The called-for outcome was registered on the
display by means of small pilot lamps located next 
to the unilluminated outcome figures. The sub- 
ject then made a response choice and operated the 
test button. Thus, under the control instruction, 
two choices were made on each trial: first a choice 
of outcome, made by pressing a call button, and 
then a response choice. 

The subject in the spectator position was visu- 
ally isolated from the active subject, but viewed 
a duplicate display. 

The events displayed were automatically con- 
trolled by the subject’s responses through relays 
and a programing device. Operation of the test 
button activated a teletype reader which read 
punched paper tape to produce the appropriate 
outcome. Two outcome sequences were punched 
on different channels of the tape. The response 
choice determined which channel was to produce 
the outcome for that trial. In the case of prob- 
lems in which outcomes were not contingent 
upon responses, identical outcome sequences were 
punched on both channels. 
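
The tape arrangement described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the names `make_channel` and `run_problem` are hypothetical, outcomes are coded 1 for the first outcome and 2 for the second, and the tape is assumed to advance one position per trial on both channels together. The counts of 48 and 12 in 60 trials correspond to outcome probabilities of .800 and .200.

```python
import random


def make_channel(n_trials, n_o1, rng):
    """Return a shuffled outcome sequence with exactly n_o1 occurrences of O1.

    Outcomes are coded 1 (O1) and 2 (O2).
    """
    seq = [1] * n_o1 + [2] * (n_trials - n_o1)
    rng.shuffle(seq)
    return seq


def run_problem(channel_r1, channel_r2, responses):
    """Tabulate the 2 X 2 cells a, b, c, d for one problem.

    On each trial the response (1 or 2) selects which channel
    supplies the outcome at the current tape position.
    """
    cell_for = {(1, 1): "a", (1, 2): "b", (2, 1): "c", (2, 2): "d"}
    cells = {"a": 0, "b": 0, "c": 0, "d": 0}
    for trial, response in enumerate(responses):
        channel = channel_r1 if response == 1 else channel_r2
        cells[cell_for[(response, channel[trial])]] += 1
    return cells


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    responses = [rng.choice([1, 2]) for _ in range(60)]

    # Noncontingent case: the identical sequence is on both channels,
    # so the number of O1 outcomes cannot depend on the responses.
    tape = make_channel(60, 48, rng)
    print(run_problem(tape, tape, responses))

    # Contingent case: the two channels carry different sequences.
    ch1 = make_channel(60, 48, rng)
    ch2 = make_channel(60, 12, rng)
    print(run_problem(ch1, ch2, responses))
```

With identical channels, the total a + c equals the programed number of first outcomes whatever the subject does; with different channels, it depends on the response choices.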

Counters recorded the events of each trial so 
that the frequency of all response-outcome combi- 
nations and, for the subjects under the control 
instruction, call-response-outcome combinations, 
could be obtained readily. 

Instructions. The instructions were not read to 
the subject, but they were explained according to 
a plan to which the experimenter adhered closely. 
The same wording was used to express the key
ideas to all subjects. Questions were answered as
they arose. Spectator subjects listened while in-
structions were given to the active subjects.

For the score instruction, the task was explained
as one of “finding a way to respond which will
make the score light appear as often as possible,”
and for the control instruction as one of “finding a
way to control which of the outcomes (square or
circle) will appear on any trial.” The subjects
were then told that at the end of each of five sep-
arate problems they were to make a judgment of
the degree of control which had been exerted
over the outcomes by response choices. They were
shown a scale marked at intervals of 10 with ex-
treme values of 0 and 100. The extremes were
labeled No Control and Complete Control. The
subjects were then told:

After each problem you are to indicate your
judgment of control by putting an “X” some
place on the scale: at 100 if complete control
has been achieved, at 0 if no control has been 
achieved, and somewhere between these ex- 
tremes if some but not complete control has 
been achieved over the outcomes. Complete 
control means that you can produce the score 
light or the no score light (alternatively, the 
circle or the square) on any trial by your choice 
of responses. No control means that you have 
found no way to make response choices so as 
to influence the outcomes. Intermediate degrees 
of control mean that your choice of responses 
influences which outcome appears even though 
it does not completely determine the outcome. 

It should be noted that in instructing the sub- 
jects who were in the score condition, it was ex- 
plicitly stated that control means the ability to 
produce the “no score” light as well as to produce 
the “score” light. Similarly, in the control in- 
struction it was stated explicitly that control 
means the ability to produce each of the two 
outcomes, at will, on any trial. 

The subjects were told that any one of the 
following three states of affairs might be found 
on any problem: (a) response choices do not af-
fect the outcomes, i.e., there is no control; (b)
one response produces one outcome more often 
than does the other response; or that (c) different 
patterns of responses produce different outcomes. 
The possibility that the correct judgment for a 
given problem might be one of zero control was 
stated explicitly.

Both spectator subjects and active subjects 
were offered the option of keeping a record of 
events. Blank space was available in the test 
booklet for this purpose. The subjects were told 
not to look back at earlier records once a new 
problem had begun.

Problems. A problem consisted of 60 self-paced
trials. The statistical structure of the problems is
shown in Table 1. Each subject worked five prob-
lems. Three were noncontingent (A, B, C), and two
were contingent (X, Y). (The pretraining prob-
lems shown in Table 1 were used only in Experi-
ment III.) Note that the noncontingent problems
differ in the degree of bias in the outcome proba-
bilities. Since $O_1$ stands for “score” in the score
instruction, the number of scores in the 60 trials
for Problems A, B, C will be, in order, 30, 48,
and 8 scores. In the case of contingent problems
(X, Y) the number of scores will depend on re-
sponse choices.

Design and Procedure. The assignment of sub-
jects to the score or control instructions and to
the active or spectator positions was made at
random. The order of problems was governed by
5 X 5 Latin squares. Twelve different randomiza-
tions of the trial sequence were used on each
problem. These randomizations were subject to
the restriction that a given outcome had the same
programed frequency in the first and second halves
of the 60 trials. In the case of contingent prob-
lems, the assignment of conditional outcome prob-
abilities to $R_1$ or $R_2$ was interchanged so that
each problem was run equally often with $R_1$ or $R_2$
leading to $O_1$ most frequently.

The subjects were given a 10-trial practice
run to familiarize them with the operation of the
equipment. They recorded their judgments for
each of the five problems on a separate page of a
booklet. The scale descri


“How much control do you [‘does the other sub-
ject’ in the case of the spectator] have over the
outcomes?”

Data from two pairs of subjects were excluded
from the analysis since in each case one of the


data were obtained on 10 pairs of subjects under
the score instruction and on 13 pairs under the
control instruction.


Results and Discussion


Effect of Instruction and Involvement.
Results on the judgment of control are
given in Table 2. Instructions had a strong
effect on judgment. The effect was particu-
larly evident in the case of Problem C
where, under the score instruction, the me-
dian judgment for active subjects was 0,
while under the control instruction it was
55.0. Under neither instruction, however,
did judgment follow the $\Delta P$ index at all
closely. In both cases, some noncontingent
problems were judged higher than one of
the contingent problems. That there was, in
fact, no significant relation of judgment to
contingency in either group is supported by
correlational data based on individual per-
formance. These data are given below in
another connection.

The assumption that, for noncontingent

TABLE 1

CONDITIONAL OUTCOME PROBABILITIES AND $\Delta P$
INDEX FOR TEST AND PRETRAINING PROBLEMS

                       Conditional outcome
                          probabilities
Problem            $\Pr(O_1/R_1)$   $\Pr(O_1/R_2)$   $\Delta P$

Test (a)
  Noncontingent
    A                  .500             .500             0
    B                  .800             .800             0
    C                  .133             .133             0
  Contingent
    X                  .800             .500            .3
    Y                  .800             .200            .6
Pretraining (b)
  Noncontingent
    A'                 .500             .500             0
    C'                 .900             .900             0
  Contingent
    Y'                 .900             .100            .8

(a) Experiments I, II, and III.
(b) Experiment III.


problems, higher frequencies of scoring 
would produce a greater concentration on 
one response alternative was not borne out. 
An analysis of variance of individual re- 
sponse biases (deviations from an equally 
frequent use of $R_1$ and $R_2$) for Problems
A, B, C under the score instruction showed 
no significant effect of the frequency of 
scores on bias. A similar analysis of re- 
sponse bias for these problems under the 
control instruction also showed no effect. 
The possibility remains that the amount of 
repetition of a particular sequence of re-
sponses was affected by the frequency of
scores, but this cannot be ascertained with 
the present data. 

In any case, the idea that a response bias 
would produce an excess of confirming over
nonconfirming cases, and thus lead to a
high degree of judged control for noncon- 
tingent problems, seems incorrect on an- 
other ground. The results for Problem A 
almost rule out the notion that judgment 
was based on a balance of confirming and 
nonconfirming cases. In Problem A the two
outcomes occur equally often. This means 
that, except for sampling errors, the num- 
ber of confirming cases (the conjunction of 
one outcome with a particular response or 
pattern of responses plus the conjunction 
of the other outcome with some other re- 
sponse) will equal the number of noncon- 
firming cases, no matter how much bias 
exists in the use of responses. But, in spite 
of the equality between confirming and 
nonconfirming cases, the subjects judged, 
on the average, that responses controlled 
outcomes to a moderately strong degree in 
this problem. 
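
The claim that bias in the use of responses cannot produce an excess of confirming cases in Problem A can be written out as a short calculation. The notation below is introduced only for this illustration and is not the authors': p is the proportion of trials on which $R_1$ is chosen, q is the probability of $O_1$ (the same after either response in a noncontingent problem), responses and outcomes are assumed independent, and the confirming cases are taken to be $R_1$ with $O_1$ and $R_2$ with $O_2$.

```latex
% Illustrative notation (not from the original report):
%   p = proportion of trials on which R_1 is chosen
%   q = probability of O_1, identical after either response
% Confirming cases are taken to be (R_1, O_1) and (R_2, O_2).
\begin{align*}
\Pr(\text{confirming})    &= pq + (1-p)(1-q), \\
\Pr(\text{nonconfirming}) &= p(1-q) + (1-p)q, \\
\Pr(\text{confirming}) - \Pr(\text{nonconfirming})
                          &= 4pq - 2p - 2q + 1 = (2p-1)(2q-1).
\end{align*}
% With q = 1/2 (Problem A) the difference is zero for every p.
% With p = 1/2 it is zero for every q.
% In both cases \Delta P = |q - q| = 0.
```

Under these assumptions an expected excess of confirming cases requires both a response bias and an outcome bias, so equal outcome frequencies alone are enough to remove it.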

The degree of active involvement had no 
significant effect on the judgment of con- 
trol. Of the 10 comparisons of mean judg- 
ment made by active as against spectator 
subjects (5 problems X two instructions) 
only a single comparison yielded a value of 
t for which p < .05. This occurred in Prob- 
lem C under the score instruction in which 
a very small difference in mean judgment 
was accompanied by a very low standard 
deviation of judgment. 

The rank-order correlation of the median 
judgment on each problem for active sub- 
jects with spectator subjects was 1.0 under 
the score instruction and .9 under the con- 
trol instruction. Thus it is clear that some 
feature of the problems does result in sys- 
tematic differences in the degree of judged 
control. 


TABLE 2

MEDIAN, MEAN, AND STANDARD DEVIATION OF JUDGED CONTROL BY PROBLEMS (WITH $\Delta P$ VALUES)
AND EXPERIMENTAL CONDITIONS

                               Score instruction                         Control instruction
                       Active subject     Spectator subject      Active subject     Spectator subject
Problem  $\Delta P$    Md     M     SD     Md     M     SD       Md     M     SD     Md     M     SD

A           0         20.0  19.9  12.0   20.0  18.1  16.8      30.0  33.5  18.4   27.0  24.2  22.4
B           0         72.5  66.6  16.1   80.0  71.0  19.5      50.0  54.2  24.0   40.0  38.6  23.1
C           0         0     4     12     0     53    69        55.0  52.4  29.7   50.0  51.4  23.2
X          .3         55.0  55.9  20.1   70.0  59.6  27.8      40.0  35.5  22.2   30.0  31.6  21.4
Y          .6         850   56.9  16.5   70.0  58.0  25.6      865.0 61.2  25.0   48.0  43.6  30.1

FIG. 1. Median judged control as a function of
mean successes.


Prediction of Judgment from Successes. 
The feature of the problem which best pre- 
dicts judgment turns out to be the number 
of successful trials. In the case of the score 
instruction the number of successes is sim- 
ply the number of times the score light ap- 
pears. Under the control instruction, it will 
be recalled, the subject indicates by means 
of call buttons the outcome which he is try- 
ing to produce on each trial. We may count 
as a success any trial on which the outcome
he is trying for appears. 

The relation of median judged control to 
the mean number of successes is shown in 
Figure 1. The amount of judged control 
shows a similar increasing trend against 
successes under both control and score in-
structions and for both active and spectator
subjects. 

The product-moment correlation of in- 
dividual judgments with number of suc- 
cesses, based on all subjects and all prob- 
lems, was .70. The correlation was .72 for 
active subjects and .68 for spectator sub- 
jects. A study of the scatter plots for other 
subgroups, and for contingent and noncon- 
tingent problems separately, did not sug- 
gest any systematic differences in the re- 
gression of judgment on success. 

A correlational analysis was also carried 
out on the relation of judgment to contin- 
gency. The response-outcome contingencies 
in the 2 X 2 tables which result from the 
subject’s choices and outcomes will differ 
from the nominal contingencies because of 
sampling errors. It is therefore possible that 
judgment is correlated to some extent with 
the actual contingency even though it bears 
no relation to the nominal contingency. 
However, the partial correlations between
judgments and the chi-square values based
on actual response-outcome frequencies in
the 2 × 2 tables, with successes held constant,
averaged only .13 for the different
groups of the experiment and were in no
case significantly different from zero. A
similar analysis on the call-outcome
contingencies for the control instruction gave
an average correlation of only .08.
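
The idea of correlating judgment with contingency "with successes held constant" can be sketched as follows. This is a minimal illustration that assumes the standard first-order partial correlation formula; the report does not say which computing formula was used, and the function names and the tiny data set in the usage note are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_r_from_data(judgments, contingency_scores, successes):
    """Partial correlation of judgment with a contingency score
    (e.g., a chi-square value), holding number of successes constant."""
    r_jc = pearson_r(judgments, contingency_scores)
    r_js = pearson_r(judgments, successes)
    r_cs = pearson_r(contingency_scores, successes)
    return partial_r(r_jc, r_js, r_cs)
```

For example, `partial_r(0.5, 0.5, 0.5)` is one third: a raw correlation of .5 shrinks once the shared dependence on the third variable is removed. When the third variable is uncorrelated with both of the others, the partial correlation equals the raw correlation.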

Of the 46 subjects, 23 made some record
of events during the problems. The correlation
of judgment with success for all subjects
making records, taken over all problems
and conditions, was .73, while that for
subjects not keeping records was .62.

In summary, the correlational analysis
shows no evidence that judgment was
systematically influenced by any feature of
the problems other than the number of successes
in 60 trials, nor by any of the experimental
conditions except insofar as these
conditions affected the number of successes.

Factors Affecting Frequency of Success.
In the case of the noncontingent problems
(A, B, C) under the score instruction, the
number of scores, and hence the number of
successes, is completely determined in advance
by the tape program. However, under
the control instruction the number of
successes (agreements between the called-for
outcome and the actual outcome) depends
jointly upon the relative frequency
with which each outcome appears and with
which it is called for. The expected number
of successes for a 60-trial problem is given
by: 60 [P(C₁) P(O₁) + P(C₂) P(O₂)],
where P(Cᵢ) is the probability of calling
for Oᵢ and P(Oᵢ) is the probability of Oᵢ.
For any problem with unequal outcome 
probabilities, the expected number of suc- 
cesses increases as the probability of calling 
for the more frequent outcome increases. 
The subjects did tend to bias calling
frequencies toward outcome frequencies, as
shown by the results given in Table 3.
Dunnett’s procedure for comparing several
means with a control mean (Steel & Torrie,
1960) was used to test the significance of
the difference in calling frequencies. The
value of 32.4 obtained in Problem A, in
which the outcome frequency was unbiased,
was the control mean. The departure from
this value was significant in Problem C
(p < .05), but not in Problem B. As a
consequence of the calling bias, the obtained
mean number of successes in Problem C
was 39.0, which is significantly larger than
the expected number of 30 based on equally
frequent calls for O₁ and O₂ (t = 3.97,
p < .001).
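
The expected-success formula can be worked through numerically. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the two columns of Table 3 give, for each problem, the mean number of calls for one outcome and the number of trials on which that same outcome appears. It also treats the group mean calling frequency as if it were a single subject's calling probability, which is an approximation.

```python
def expected_successes(p_call_o1, p_o1, n_trials=60):
    """Expected agreements between call and outcome when calls are
    independent of outcomes: n * [P(C1)P(O1) + P(C2)P(O2)]."""
    p_call_o2 = 1.0 - p_call_o1
    p_o2 = 1.0 - p_o1
    return n_trials * (p_call_o1 * p_o1 + p_call_o2 * p_o2)


# Problem C values as read from Table 3 (assumed column meaning, see above).
calls_for_outcome = 16.8   # mean number of calls in 60 trials
outcome_frequency = 8      # appearances of that outcome in 60 trials

unbiased = expected_successes(0.5, outcome_frequency / 60)
matched = expected_successes(calls_for_outcome / 60, outcome_frequency / 60)
print(round(unbiased, 2), round(matched, 2))
```

With equally frequent calls the expectation is 30 whatever the outcome probabilities are. With the calling bias observed in Problem C the expectation rises to roughly 39.7, close to the obtained mean of 39.0.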

By trying more often to produce that 
outcome which is preprogramed to appear 
more often, the subjects produce more successes
and thus judge a higher degree of
control. The tendency to match the probability
of predicting an outcome to its probability
of appearing is a well-known result
when the subject’s task is one of prediction 
(e.g., Grant, Hake, & Hornseth, 1951). The 
presence of a similar trend when the sub- 
jects are instructed to control outcomes 
suggests that they may fail to distinguish 
the prediction from the control of outcomes. 

In the case of contingent problems under 
the score instruction, successes increase 
with the proportion of trials on which the 
response associated with the higher condi- 
tional score probability is used. A signifi- 
cant preference for the response choice as- 
sociated with the higher conditional score 
probability did occur for Problem Y (obtained
mean frequency of 44.8 against an
expectation of 30; t = 7.57, p < .001)
where the contingency is strongest, but not 
for Problem X (obtained mean frequency 
of 35.0). Thus, in Problem Y the mean 
number of successes was increased from the 
expectation of 30 to an obtained value of 
38.6 (t = 6.83, p < .001). 

Results for the control instruction were 
similar. In Problem Y, those call-response 
combinations which maximize the expected 
number of agreements between call and 
outcome were used with significantly 
greater frequency than expected by chance 
(obtained mean of 47.5 against an expecta- 
tion of 30; t = 6.16, p < .001). As a result, 
the number of successes was increased from 
an expectation of 30 to an obtained mean
of 40.3 (t = 6.83, p < .001).

The results on response choices for con- 
tingent problems show that the stronger 
contingency in Problem Y did have an effect
on performance even though, as previously
shown, there is no evidence that
contingency had any direct effect on judgment.


TABLE 3
MEAN CALL AND OUTCOME FREQUENCIES FOR
NONCONTINGENT PROBLEMS, CONTROL
INSTRUCTION

Problem   Mean call frequency   Outcome frequency
A         32.4                  30
B         36.0                  48
C         16.8                  8


Meaning of Judgment. The absence of a 
relation between judged control and actual 
contingency in any of the experimental 
groups makes it quite unclear as to what 
the subject means by the judgment. One 
would like to know how the judgment is 
related to other statements which the sub- 
ject might be ready to make about the con- 
nection between his performance and the 
outcomes. In particular, does the judgment 
of a high degree of control carry with it the
implication that the proportion of outcomes 
of a given kind can be greatly altered 
through responses? Perhaps the subjects 
take the word “control” to be synonymous 
with “getting what you want,” and not 
with the ability to alter what you get. If 
so, the subject might actually have a cor- 
rect appreciation of the degree to which 
outcomes can be altered even though his 
judgments of control are unrelated to con- 
tingency. On the other hand, it may be that 
when the subject judges a high degree of
control he also believes that he is able to 
alter outcomes. 

These questions were examined in Ex- 
periment II in which the subject was asked 
both to estimate his ability to alter out- 
comes and, as before, to judge control. 


EXPERIMENT II


Two different sets of questions were used 
with separate groups of subjects in an at- 
tempt to assess the subject’s belief in his 
ability to alter outcomes. The questions 
were answered at the end of each problem. 
In one set, referred to as the “switched-intention”
set, the subject is first asked to
estimate how often he would be able to produce
O₁ if he were given another 60 trials.
He is next asked how often, given still
another 60 trials, he could produce O₂ if he
switched his intention to the production of
that outcome. These two estimates can be
used to arrive at a subjective ΔP index, or
ΔP’. The first answer is taken as an estimate
of the probability of O₁ given whatever
response or pattern of responses the
subject believes is most likely to produce
O₁. The second answer yields, by subtraction
from 60, an estimate of the probability
of O₁ when the subject is trying to avoid it,
i.e., when he is trying to produce O₂. The
value of ΔP’ is the difference between these
two conditional probabilities of O₁. The
logic behind the computation of ΔP’, the
subjective index, is no different from that
behind the computation of the ΔP index of
contingency.

Another set of questions is referred to as
the “random-player” set. The subject is
first asked how often in 60 trials he can
produce whatever outcome he feels best
able to produce. He is then asked how often
that outcome would occur if the response
choice had been made by chance, i.e., by
the flip of a coin. The difference between
the subject’s estimate of how often he can
produce the chosen outcome, and his estimate
of how often it would be produced by
a random player, also provides an index of
his belief in his ability to alter outcomes.

The actual values of ΔP’ based on the
answers to the switched-intention questions
might be expected to run higher than the
values based on the answers to the random-player
questions. Presumably, a greater
difference in outcome probabilities is produced
by exerting control in two opposing
directions (i.e., in the attempt to first maximize
one outcome and then to maximize its
exclusive alternative) than by exerting control
in only one direction and comparing
the results against chance.
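
The two subjective indices can be sketched in a few lines. The function names are hypothetical, and expressing the random-player difference as a proportion of the 60 trials is an assumption made here so that the two indices are on the same scale; the report only says that the difference between the two estimates is used.

```python
def delta_p_switched_intention(est_o1_trying_o1, est_o2_trying_o2, n_trials=60):
    """Subjective index from the switched-intention answers.

    est_o1_trying_o1: estimated number of O1 outcomes when trying for O1.
    est_o2_trying_o2: estimated number of O2 outcomes when trying for O2.
    """
    p_o1_when_trying = est_o1_trying_o1 / n_trials
    # O1 and O2 are exclusive, so trials without O2 are trials with O1.
    p_o1_when_avoiding = (n_trials - est_o2_trying_o2) / n_trials
    return p_o1_when_trying - p_o1_when_avoiding


def delta_p_random_player(est_own, est_random_player, n_trials=60):
    """Subjective index from the random-player answers, as a proportion."""
    return (est_own - est_random_player) / n_trials
```

A subject who believes he could produce either outcome 54 times in 60 gets a switched-intention index of .8, while the same belief compared against a coin-flipper's 30 gives a random-player index of .4. This is the sense in which the first index would be expected to run higher than the second.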


Method 


Subjects. Thirty-two undergraduates at Duke 
University, males and females, served as subjects. 

Procedure. The same apparatus and set of five 
test problems were used as in Experiment I. All 
subjects were run individually in the active posi- 
tion, and all made a judgment of control on a 
scale as in Experiment I. Half of the subjects ran 
under the score instruction and half under the 
control instruction. 

Under each instructional condition, half of the
subjects answered the random-player questions,
and half of them answered the switched-intention
questions.

No statement was made concerning the possibility
that the outcome which appears on a trial
depends on response choices for preceding trials.
For the control instruction, greater emphasis was
placed on the fact that the call buttons were only
indicators of intention and had no effect on
outcomes. The subjects were required to state
correctly the definitions of complete control, no
control, and partial control prior to beginning the
first problem.

The additional questions were explained to the
subject in advance, as was the scale for the
judgment of control.

Design. Eight subjects were run in each of the
four experimental conditions (score and control
instructions, each with switched-intention and
random-player questions). The subjects were
assigned to these conditions at random. Within each
group every subject received the five problems in
a different order, creating an approximate balance
in the frequencies with which problems appeared
in each ordinal position. Eight different random
tapes were used for each problem. The same set
of orders and randomizations was used in each
of the four groups.


Results and Discussion


Relation of ΔP’ to Judged Control. No
significant correlation between judgments
on the scale of control and ΔP’ values was
found within any of the four experimental
groups. Indeed, the scale values for judged
control associated with ΔP’ values of zero
were scattered over the range of the scale
from zero control to almost complete control.

The scale judgments compared quite
closely with those obtained in Experiment
I, and appeared to follow the same increasing
trend with number of successes. The
ΔP’ values showed no relation to successes
nor to actual contingencies. For example,
the number of subjects who gave higher
mean values for ΔP’ on contingent than on
noncontingent problems was no greater
than expected by chance (18 out of 32 was
obtained against the chance expectation of
16).
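
The comparison of 18 out of 32 with a chance expectation of 16 can be checked with a simple binomial calculation. This is a sketch under the assumption of a one-sided sign test with probability .5 per subject; the report does not name the test it used, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Probability that 18 or more of 32 subjects would show the difference
# in the predicted direction if each did so with probability .5.
p_value = binomial_upper_tail(18, 32)
print(round(p_value, 3))
```

The tail probability comes out at about .30, far from any conventional significance level, which agrees with the statement that the count was no greater than expected by chance.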

The ΔP’ values were extremely erratic.
Whereas a representative value for the
coefficient of variation based on judgment of
control would be 50%, a typical value for
ΔP’ would be in excess of 100%. Further,
whereas Kendall's index of concordance, W,
based on the rank orders given to problems
by different subjects, showed significant
concordance in all four experimental groups
when the ranks were based on judged control
(p < .01 in each case), it failed to reach
significance in any group when the ranks
were based on ΔP’. Thus, we are unable to
distinguish with confidence the rankings for
problems based on ΔP’ values from random
assignments of ranks to problems. Contrary
to expectation, the overall mean
value of ΔP’ based on the switched-intention
questions was not significantly higher
than that based on the random-player
questions.
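
Kendall's index of concordance can be illustrated with a short computation. The sketch assumes the usual formula without a correction for tied ranks, W = 12S / [m²(n³ − n)], where m subjects each rank n problems and S is the sum of squared deviations of the rank totals from their mean. The function name and the example rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for untied rankings.

    rankings: one list per subject, each a permutation of 1..n giving
    the rank that subject assigned to each of the n problems.
    """
    m = len(rankings)
    n = len(rankings[0])
    expected = list(range(1, n + 1))
    for ranks in rankings:
        if sorted(ranks) != expected:
            raise ValueError("each ranking must be a permutation of 1..n")
    rank_totals = [sum(ranks[j] for ranks in rankings) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((total - mean_total) ** 2 for total in rank_totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

W is 1 when every subject orders the five problems identically and falls to 0 when the rank totals are all equal, as happens when two subjects give exactly opposite orders. Erratic values such as those described for the subjective index would give W near the low end of this range.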

The lack of correlation between judged
control and ΔP’ values leads to the conclusion
that the subject’s concept of control is
not typically equivalent to “the ability to
alter outcomes.” Further, the lack of relation
between ΔP’ and actual contingency
indicates that the subjects do not have a
correct appreciation of their ability to alter
outcomes.

Problem of Instructions. The possibility
remains that the subjects do have a concept
similar to that of control in the sense of
contingency, but that the instructions have
failed to bring that concept into play.

The question of how to instruct is a par- 
ticularly difficult one in the present con- 
text. It would, of course, be possible to in- 
struct the subject in an explicit procedure 
for calculating or estimating ΔP’, but this
would tell us very little about his everyday
concept of control. The approach taken in
Experiment III to the problem of how to 
give a clearer instruction without providing 
an explicit rule of calculation was to give 
the subject prior experience with an ex- 
ample of zero control and of strong control. 
If the subject does have a concept of con- 
trol in the sense of contingency, perhaps it 
can be brought into play by this small 
amount of pretraining. 


EXPERIMENT III


It would appear from the results of the
two previous experiments that the examples
used for the purpose of instruction must, if
they are to lead to valid judgment, counteract
the tendency to judge control in terms
of successes. Further, if the ΔP’ values are
to become consistent with judged control,
it will be necessary to jointly specify correct
values for judged control on the scale
and correct answers to the additional
questions from which the ΔP’ values are
obtained.

This line of reasoning led to the use of 
three different types of pretraining. One 
group of subjects received examples which 
were chosen so that the number of successes 
that would be achieved was correlated with 
the correct values for the amount of con- 
trol. In this group, judged control on test 
problems should show, as in the previous 
experiments, a dependence on successes
since the pretraining does nothing to undo
the relation of judgment to success. In a
second group, the pretraining examples
were chosen so that the number of successes 
would not vary with the correct values for 
judged control. This might lead to valid 
judgment since the tendency to judge on 
the basis of success should be counteracted. 
Finally, a third group received the same 
pretraining examples as did the second 
group, but in addition, was given correct 
answers to the questions from which the 
ΔP’ values are computed. Valid judgments
of control are, presumably, most likely in 
this group, since correct estimates of the 
alterability of outcomes are given together 
with the correct values for judged control. 
The equivalence of control and alterability 
should be emphasized by this procedure. 

All subjects were given the control in- 
struction. They worked two pretraining 
problems. In one of these there was no con- 
tingency, while in the other there was a 
rather strong contingency. The correct 
judgments for these problems were shown 
to the subject in advance. The presence or 
absence of a correlation of success with 
control was manipulated by variations in 
the pretraining problem which exemplified 
no control. It was known from Experiment 
I that under the instruction to control out- 
comes, the subjects tend to match the fre- 
quency of calling a given outcome to the 
frequency with which that outcome appears.
As a consequence, the number of successes
increases with an increasing predominance
of one outcome. It was also
known that the subjects take advantage of
a moderately strong contingency of outcomes
upon responses to increase the number
of successes. Therefore, it should be
possible to produce about the same number
of successes in a noncontingent problem
with disproportionate outcome frequencies
as in a moderately contingent problem. The
pretraining problems for Groups II and III
in the present experiment were selected to
produce this result, and thus to remove any
correlation between contingency and the
mean number of successes.

TABLE 4
MEAN SUCCESSES ON PRETRAINING PROBLEMS

Group   A’     C’     Y’
I       28.4   —      42.8
II      —      41.4   41.2
III     —      42.1   45.4

On the other hand, when the noncontin- 
gent pretraining problem has equally fre- 
quent outcomes, the mean number of suc- 
cesses will be less than for the contingent 
problem. Contingency and the mean num- 
ber of successes will, in this case, vary to- 
gether. The pretraining examples in Group 
I were selected to achieve this result. 


Method 


Subjects. Twenty-four undergraduates at Duke 
University, males and females, served as subjects. 

Procedure. The apparatus was the same as in 
previous experiments. The same five test prob- 
lems, problem orders, and randomizations were 
used as in Experiment II.

The pretraining problems each consisted of
60 trials. Their statistical structure is shown in
Table 1. All three groups received, as an example
of moderately strong control, Problem Y’, which
is similar to Problem Y of the test series but
represents a somewhat stronger contingency. It
should, on the basis of previous results, yield about
40 successes in 60 trials. The example of zero
control for Group I was Problem A’. It is identical
with Problem A of the test series: it has equal
outcome frequencies, and it will yield an average
of about 30 successes in 60 trials. In Groups II
and III the example of zero control was Problem
C’, which is similar to Problem C of the test series
but has slightly more disproportionate outcome
frequencies. Results of the previous experiments
predict an average of about 40 successes for
Problem C’.

Instructions. All subjects were run under the
modified control instruction used in Experiment
II. They all answered the random-player questions
in addition to making a judgment of control by
marking the scale. Prior to working pretraining
problem Y’, the subject was told: “You will have
very good control over the outcomes by your
choice of responses.” He was then shown a sample
answer sheet with the scale of judged control
marked at 80. Prior to working the pretraining
problem exemplifying zero control the subject
was told: “Your choice of responses will have no
influence over which outcome will appear.” He
was then shown a sample sheet with the scale of
judged control marked at zero.


Group III received the following additional
information concerning correct answers to the
random-player questions. For Problem Y’, the
subject was told:

If you were to try for the circle on each of 60
trials you could make it appear 54 times. A
coin-flipping player would get the circle only
30 times. If you had decided instead to go for
the square, you could make that outcome appear
54 times out of the 60 trials. And, of course,
the coin-flipper would get just 30 squares.

For Problem C’, the subject was told:

No matter which outcome you try for, the
square will appear 54 times in the 60 trials. The
random player would also get 54 squares in the
60 trials.

Design. Eight subjects were run in each group. 
Four different randomizations of each pretraining 
problem were used. For Problem C’, O₁ appeared
most frequently for half the subjects while O₂
appeared most frequently for the remaining
subjects. For Problem Y’, R₁ led to O₁ most frequently
for half of the subjects, while R₂ led to O₁ most
frequently for the remaining subjects.


Results and Discussion 


Successes on Pretraining Problems. The 
mean number of successes on pretraining 
problems is shown in Table 4. The results 
were as anticipated. The mean number of 
successes on Problem A’ was well below the 
value for Y’, producing the desired correla- 
tion between success and control in Group 
I. The mean number of successes on Prob- 
lem C’, on the other hand, was very close 
to the value for Y’. Therefore, as was in- 
tended, successes did not vary systemati- 
cally with control in Groups II and III. 

Judgment of Test Problems. The discussion
of results centers on the effects of the
experimental conditions on the correlations 
between the variables which appear in the 
column headings of Table 5. 

The mean correlation of judgment with
success was significantly higher in Group I
than was the overall mean correlation for
Groups II and III combined (p < .025,
one-tailed U test based on individual rho's).
Thus we obtained the anticipated reduction
in the correlation of judgment with success
as the result of identifying a problem which
yields frequent success as one having zero
control. Ranks based on the judgment of
control showed significant concordance in
Group I (p < .05), but not in Groups II or
III.

The correlation of ΔP’ with success was
moderately high in Groups I and II. It was
lower and not significantly greater than
zero in Group III, in which the correct ΔP’
values were specified in pretraining. The
rankings based on ΔP’ were not significantly
concordant in any group.

The mean correlation of judgment with
ΔP’ was substantial in Groups I and III
and significantly lower than for either of
these in Group II (p < .025 by one-tailed
U test). An interpretation of this pattern
is that the correlation is high in Group I
largely because both judgment and ΔP’ are
following success: a result also obtained in
the comparable group in Experiment II.
The correlation is high in Group III because
of the joint specification of correct
values for ΔP’ and for judged control in
pretraining. In the absence of these special
circumstances ΔP’ and judged control are
not in agreement.

The correlation of judged control and of
ΔP’ with the ΔP index of contingency is
given in the last two columns of Table 5.
These correlations are low, and in all cases
they are not significantly different from
zero. It is apparent that the pretraining
conditions for Groups II and III yield no
improvement in the validity of judgment
or of ΔP’ values.

The results of Experiment III go against
the notion that the failure to find valid
judgments in the first two experiments was
due to a lack of communication. Even with
appropriate pretraining, no significant
correlation appears between the ΔP index of
actual contingency and judged control, or
between the ΔP index and the subject's
estimates of his ability to do better than a
random player.


TABLE 5
MEANS OF INDIVIDUAL RHO'S

Group   J-S    ΔP’-S   J-ΔP’   J-ΔP   ΔP’-ΔP
I       .7*    .06*    .91*    .25    .01
II      .18    ,AS*    .15     .01    .16
III     .38    ^4      .73*    .25    .02

Note. Abbreviations are: J = judgment, and
S = success.

* Significantly different from 0 at .05 level by
binomial test for number of individual rho's > 0.

The conditions of pretraining did, however,
have an effect. When the pretraining
problems were selected to produce a covariation
of successes with actual control
(Group I), judged control increased with 
successes, a replication of the results of the 
first two experiments. When, however, the 
pretraining problems produced the same 
mean level of success on the example of no 
control as on the example of strong control, 
the tendency to judge control on the basis 
of successes was removed. 

The results for the ΔP’ values are similar
to those of Experiment II. These values 
again fail to show significant concordance; 
and in Group II, they also fail to correlate 
significantly with judged control. Thus it 
is again found that the subject’s judgment 
of control does not predict his estimate of 
the degree to which he can alter outcomes. 


QUESTIONNAIRE RESULTS: EXPERIMENTS
I, II, AND III


In each experiment subjects answered 
a written postexperimental questionnaire 
which elicited certain background data and 
information concerning the basis of judgment.

Of the total of 102 subjects in all experi- 
ments, 19 reported having taken at least 
one college-level course in statistics. Seventeen
of these were in Experiment I. The
pattern of judged control for these subjects 
was similar to that given by subjects with- 
out statistical training. 

All postexperimental questionnaires contained
the following item:

If you were to observe someone else working a
problem in which on every trial he made the same
response and got the same outcome, how much
control would you say he had over the outcomes?

The subjects checked the following alternatives
with the frequencies given in parentheses:
complete control (49), medium control (15),
no control (8), uncertain (30).


TABLE 6
ΔP INDEX, SUCCESSES, MEDIAN JUDGMENT OF
CONTROL, AND NUMBER OF SUBJECTS ELECTING
TO JUDGE FOR HYPOTHETICAL PROBLEMS
IN THE POSTEXPERIMENTAL QUESTIONNAIRE,
EXPERIMENT II

                  Score instruction              Control instruction
Problem  ΔP   Successes  Mdn. judgment  Nᵃ   Successes  Mdn. judgment  Nᵃ
1        0    48         77             15   39         50             12
2        5    35         35             13   32         5              12
3        7    40         50             12   18         30             13

ᵃ N of subjects electing to judge.

In Experiment I, in which the possibility 
of sequential dependencies was stated ex- 
plicitly, 40 of the 46 subjects reported their 
belief that outcomes on a trial depended in 
various ways on the events of preceding 
trials. In Experiments II and III, no state- 
ment was made to the subject about se- 
quential dependencies. In Experiment II, 
16 of the 32 subjects indicated their belief 
in such a dependency; while in Experiment 
III, 10 of the 24 subjects did so. Inspection 
of the judgments made by those who re- 
ported sequential dependencies and by 
those who did not, revealed no systematic 
differences. 

The questionnaire for Experiment II in- 
cluded hypothetical sets of data for three 
60-trial problems. For subjects under the 
score instruction these data were the four 
cell frequencies of the 2 x 2 table. For 
those under the control instruction, fre- 
quencies were given for the eight possible 
call-response-outcome combinations. The 
subjects were asked to make a judgment of 
control for these data. However, they were 
given the option of omitting judgment on 
any problem for which they had no idea 
how to proceed. The results are shown in 
Table 6. The influence of success on judgment
of control and the failure of judgment
to parallel the ΔP index of contingency is
evident from these results.
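
The contrast between the contingency index and the number of successes for a 2 x 2 table of cell frequencies can be made concrete. This sketch takes the index to be the difference between the conditional probabilities of one outcome given each of the two responses, in line with the description of the subjective index given earlier. The function names and the cell frequencies are hypothetical and are not the questionnaire's actual data.

```python
def delta_p(r1_o1, r1_o2, r2_o1, r2_o2):
    """Contingency index from the four cell frequencies of a 2 x 2 table:
    P(O1 | R1) - P(O1 | R2)."""
    p_o1_given_r1 = r1_o1 / (r1_o1 + r1_o2)
    p_o1_given_r2 = r2_o1 / (r2_o1 + r2_o2)
    return p_o1_given_r1 - p_o1_given_r2


def successes(r1_o1, r1_o2, r2_o1, r2_o2):
    """Number of successes when O1 is the favorable (score) outcome."""
    return r1_o1 + r2_o1


# Hypothetical 60-trial problems, 30 trials with each response.
no_contingency = (24, 6, 24, 6)     # O1 follows either response 80% of the time
strong_contingency = (27, 3, 3, 27)

print(delta_p(*no_contingency), successes(*no_contingency))
print(round(delta_p(*strong_contingency), 2), successes(*strong_contingency))
```

The first table yields 48 successes with an index of zero, while the second yields only 30 successes with an index of .8. A judge who follows successes would therefore rank the two problems in the opposite order from a judge who follows contingency.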

The appearance of a tendency to judge 
control on the basis of success on the ques- 
tionnaire is interesting. It parallels the re- 
sults for the test problems and thus suggests 
that these results may be quite general 
rather than being a consequence of certain 
special features of the present experimental 
task. The subject is not required to retain 
serially acquired information in answering 
the questionnaire, nor does he have any 
role in producing the data to be judged. 
Perhaps most significant is that the rela- 
tion of judgment to success appears even 
when the subject is provided with a tabu- 
lation of event frequencies in the appro- 
priate categories. When the judgment is 
made on the basis of a tabular summary of 
frequencies rather than after experience 
with a series of events, there is no oppor- 
tunity for erroneous beliefs about the effi- 
cacy of patterns of choices to enter into the 
judgment. Since the relation between suc- 
cesses and judged control continues, it 
would appear that the belief in response 
patterns is not a necessary condition of the 
correlation of judgment with success. 


CONCLUDING DISCUSSION


The main finding of these experiments is 
that the amount of judged control was a
function of the frequency of successful out- 
comes rather than of the actual dependency 
of outcomes upon responses. The relation 
of successes to judgment is robust since it 
appears when the subjects work with neu- 
tral outcomes (control instruction) as well 
as with favorable and unfavorable out- 
comes (score instruction). It appears when 
the relevant events are simply observed, as 
well as when they are produced by re- 
sponses. Further, the subjects who kept 
trial-by-trial records were no less subject 
to the effect than those who relied on their 
unaided memory. Finally, successes con- 
tinued to have their effect when the judg- 
ments were made from an appropriate sum- 
mary tabulation of the events rather than 
from an unprocessed trial-by-trial se- 
quence. 

The fact that the subjects mark a scale
in response to a question about control does
not mean that they have a concept of control
which entails the core concept of contingency.
It now seems unlikely that the
typical subject in these experiments has
such a concept. Not only is a high degree
of control often judged in the absence of
contingency, but the judgment, however
arrived at, does not consistently imply for
these subjects the ability to alter outcomes
through response choices. This conclusion
is based on the failure of judged control to
stand in any sensible relation to the subject’s
estimate of his ability to outperform
a random player or to change at will the
proportion of outcomes of one kind. While
it may be that the concept of contingency
could have been evoked by other questions
and instructions, the failure of pretraining
in Experiment III to do so argues that a
simple lack of communication was not
responsible.

The conclusion that the subjects in the 
present experiment were without a concept 
of contingency is not intended to preclude 
the possibility that far more valid judg- 
ments of the same statistical structures 
could be made if the events were cast in a 
different context. An example of such a 
context might be one in which inputs were 
represented as the presence or absence of 
a drug and outputs as recovery or nonre- 
covery from infection. A conclusion that is, 
however, warranted by the results of this 
experiment is that the typical subject in 
this population did not have an abstract 
appreciation of statistical contingency. As 
has been noted, Smedslund (1963) stated a 
similar conclusion:


normal adults with no training in statistics do 
not have a cognitive structure isomorphic with 
the concept of correlation. 


It might be added from the results of the 
present experiment that training in statistics
will often fail to improve matters.

We are left, however, with the finding 
of Inhelder and Piaget to the contrary, 
namely, that correlational reasoning often 
appears by the age of 14 or 15 years. It 
has been seen, however, that the formula- 
tion of the concept of correlation by these 
authors is correct only for a special case 
and that it is doubtful that their subjects
were using the more general concept of cor- 
relation. Further, the context and the 
method of obtaining the judgment in the 
present experiments were quite different 
from those in Inhelder and Piaget. Differ- 
ences of this kind may have strong effects 
upon the level of reasoning in the judgment 
of relationship. 

If one were to generalize the present re- 
sults broadly, one would be left with the 
puzzle of how people get along as well as 
they do even though they are unable to 
judge correctly that some event is con- 
trolled by or, on the other hand, is inde- 
pendent of some other event. Surely the 
distinction has implications for adaptive 
behavior. The puzzle may be lessened by 
considering some of the ways in which the 
present experimental task is not represen- 
tative of the natural conditions of such 
judgments. Two features, which may be 
particularly important, are the absence of 
relevant temporal variations between in- 
put and output, and the discrete nature of 
the binary input and output. 

The temporal succession of input and 
output was the same for all degrees of con- 
tingency and was thus irrelevant. Under 
natural conditions, however, temporal prox- 
imity is undoubtedly an important deter- 
miner of the judgment or perception that 
events are related. It appears to be gen- 
erally the case that events which are in a 
statistical sense highly contingent upon 
some antecedent, also tend to follow that 
antecedent closely in time. 

In the present task, the input events R1
and R2 were discrete as were the outputs
O1 and O2. As a consequence, three states
were actually involved in both the input 
and output since there was also the be- 
tween-trial state in which neither of the 
alternate inputs and neither of the outputs 
occurred. This is to be contrasted with the 
case in which the input and output are of 
the on-off type, so that a momentary input 
appears against a background of nonoccur- 
rence and is followed, with some probabil- 
ity, by an output event which also appears 
as the interruption of a resting state. Per- 
haps such a context is more representative 
and would lead to more valid judgment. It 
is also in this context that temporal variables
would normally become important.

The features of the present task which 
seem to militate against valid judgments 
are related to the rough distinction made at 
the outset of this report between the per- 
ception and the judgment of a relation. It 
was suggested that the term perception ap- 
plies to the case in which the awareness of 
a relation follows immediately upon the 
joint occurrence of two events which rarely
appear alone, whereas the term judgment 
applies when antecedent and consequent 
often occur alone and a series of observa- 
tions is necessary in order to estimate 
whether the frequency of joint occurrence 
exceeds the chance expectation of joint
occurrence. The distinction is, of course,
related to degree of contingency, since the
conditions under which one may speak of 
the perception of a relation are generally 
those associated with strongly contingent 
events. One could still speak in such cases 
of a comparison of conditional probabilities,
i.e., of the probability of the event
with, as against without, the antecedent. 
However, the appreciation of the probabil- 
ity of the event in the absence of the ante- 
cedent would be based on prior experience 
of long standing and would be more like an 
expectation than an estimate. Thus, from a 
psychological point of view, it may be misleading
and artificial to view the perception
of contingency in terms of a comparison
of probability estimates.
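
To make the comparison of conditional probabilities concrete, the following is a minimal sketch in Python, not a procedure taken from this report. It assumes that the trials of one problem are tallied in a 2 x 2 table of responses (R1, R2) by outcomes (O1, O2), takes the difference P(O1 | R1) - P(O1 | R2) as one common index of contingency, and counts every appearance of O1 as a success. The function names and both example tables are hypothetical and were chosen only to show that many successes can accompany zero contingency.

```python
from fractions import Fraction


def delta_p(r1_o1, r1_o2, r2_o1, r2_o2):
    """Difference between P(O1 | R1) and P(O1 | R2) for a 2 x 2 table.

    Each argument is a trial count, e.g. r1_o1 is the number of trials
    on which response R1 was followed by outcome O1.
    """
    p_o1_given_r1 = Fraction(r1_o1, r1_o1 + r1_o2)
    p_o1_given_r2 = Fraction(r2_o1, r2_o1 + r2_o2)
    return p_o1_given_r1 - p_o1_given_r2


def successes(r1_o1, r2_o1):
    """Number of trials ending in O1 (assumed here to be the favorable outcome)."""
    return r1_o1 + r2_o1


# Hypothetical 60-trial tables (illustrative only, not the monograph's problems).
noncontingent = (24, 6, 24, 6)   # O1 is frequent whichever response is made
contingent = (12, 18, 0, 30)     # O1 occurs only after R1, and not often

print(delta_p(*noncontingent), successes(24, 24))  # 0 48
print(delta_p(*contingent), successes(12, 0))      # 2/5 12
```

A judge who tracked only the number of successes would rate the first table as showing more control than the second, which is the reverse of what the difference in conditional probabilities indicates.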

The present results do not in fact sug- 
gest that comparisons of probabilities 
played much, if any, role in mediating the 
judgment. Rather, it appears that instead 
of making comparisons between events 
within the task, control was judged in terms 
of the degree of success in the performance 
of the task as a whole. It was, for example, 
not uncommon for subjects to speak of 
having control over just one of the out- 
comes; a remark which is incompatible 
with a judgment based on differential ef- 
fects of responses within the task itself. It 
is as though the subjects were evaluating 
their performance against some expectation 
of how often a favorable event would occur 
if responses had no control over outcomes. 


Many subjects were apparently judging 
control against a base-line expectancy of 
zero successes. Even with only 8 successes 
in 60 trials (Problem C under score instruc- 
tion) almost half of the subjects judged 
some nonzero level of control. 

An expectation of zero successes in the 
absence of control could be understood as a 
generalization from common experience. In 
ordinary commerce with the environment 
the joint occurrence of some action and a
favorable event (or, more broadly, an event 
upon which attention has been focused) al- 
most always represents a contingent or 
causal relation. Chairs do not often move 
unless pushed, lights do not often come on 
until the switch is thrown, and so on. In 
these cases the assumption that the event 
never occurs until caused is generally cor- 
rect. Control over a single outcome is per- 
ceived against a resting state of no occur- 
rence. When the assumption of a zero base 
line is altogether inappropriate, such as in 
games of chance, casual observation as well 
as the present results suggest that erroneous 
beliefs in controlling or contingent relations 
are prevalent. 


SUMMARY 


Three experiments were conducted to ex- 
amine subjects’ beliefs in the degree of con- 
trol exerted over outcomes through response 
choices when outcomes were or were not 
contingent upon responses. 

All subjects worked a set of two contingent
and three noncontingent problems in a
two-response, two-outcome situation. After 
each problem the subject judged the degree 
of control exerted by his responses over 
outcomes and in Experiments II and III 
also made certain estimates relating to his 
ability to manipulate outcome frequencies.
The subjects were told in advance that they 
were to judge control and that for some 
problems the correct judgment might be 
one of zero control. 

In Experiments I and II, judgments were 
obtained from subjects who made response 
choices or who were only spectators. Judg- 
ments were also obtained under the instruc- 
tion to produce as many scores as possible 
(score instruction) or under the instruction 
to control the appearance of neutral outcomes
(control instruction). In either case,
certain events may be defined as successes. 
Under the score instruction a success is 
simply the appearance of a “score” light; 
while under the control instruction it is an 
agreement between the outcome which the 
subject is trying to produce on a given trial 
and the outcome which appears. 

The judgment of control was positively 
correlated with the frequency of success in 
all conditions. It was not systematically in- 
fluenced by the presence versus the absence 
of contingency or by the other experimental 
variations except insofar as these affected 
the frequency of success. 

In Experiment III the effect of a lim- 
ited amount of pretraining on judgment 
was examined. All subjects worked a con- 
tingent and a noncontingent sample prob- 
lem for which correct judgments were spec- 
ified in advance. The use of pretraining 
problems in which the frequency of success 
was greater on the contingent than on the 
noncontingent sample resulted in a correla- 
tion of judged control with success similar 
to that obtained in the previous experiments.
When the structure of the pretraining
problems led to approximately the same
number of successes on contingent and
noncontingent problems, the correlation of
judgment with success was significantly
reduced. The validity of judgment was,
however, not improved.

Whereas there was significant concord- 
ance among the subjects in their judgment 
of control (except where pretraining re- 
duced the correlation of judgment with suc- 
cess) a measure of manipulatability de- 
rived from the subjects’ estimates of their 
ability to produce given outcomes consist- 
ently failed to show a significant degree of 
concordance. This measure was generally 
not in agreement with judged control, nor 
did it accurately reflect the presence versus 
the absence of contingency. 

The consistent failure to discriminate 
contingent from noncontingent structures 
and the lack of agreement of formally 
equivalent measures of the subject’s beliefs 
concerning the control of outcomes by responses
suggest that erroneous beliefs concerning
control may be traced to the absence
of a statistical concept of contingency
in untutored subjects. There is suggestive 
evidence that the subjects do not distin- 
guish the ability to manipulate outcomes 
from their ability to predict them. The base 
line against which the subjects assess their 
performance appears to be one of zero oc- 
currence of the event of interest in the ab- 
sence of personal causation. This base line, 
which is inappropriate in the present con- 
text, may arise through a generalization 
based on everyday experience. 

The problem of the study reported in this monograph was twofold: to construct
a scale for the measurement of the maturity of vocational attitudes
in adolescence, and to develop some new methods for the assessment of 
developmental phenomena. A 100-item questionnaire, the Attitude test of 
the Vocational Development Inventory, was administered to 2822 Ss classi- 
fied by 1-year age intervals between 11-5 and 17-6 and by Grades 5 through 
12. The major findings indicated that (a) verbal vocational behaviors are 
monotonically related to both age and grade, but are more frequently asso- 
ciated with the latter than the former; (b) trends in responses over age 
and grade are from True to False; and (c) stages in the maturation of
vocational attitudes are primarily associated with the transitional points 
in the educational system: elementary vs. junior high school vs. senior high
school.


¹ This is the first report of a series of cross-
sectional and longitudinal studies of vocational
development sponsored jointly by the United
States Office of Education and the University of
Iowa as part of the Cooperative Research Program.

For nearly a half century in the field of
vocational psychology, from the pioneer
work of Parsons (1909), through the
writings of the Minnesota psychologists
(Paterson & Darley, 1936; Williamson,
1939, 1950), to the research of the Aviation
Psychology Program (Flanagan, 1948) and
the United States Employment Service
(Dvorak, 1947), the predominant conception
of vocational choice was essentially a
cross-sectional, nondevelopmental one. Ex- 
trapolated from counseling experience with 
“matching men and jobs” (Paterson, 1949)
and founded upon the procedures and propositions
of “trait-and-factor” theory (Pepinsky
& Pepinsky, 1954), it emphasized
the ahistorical, instantaneous, nondynamic
elements in vocational decision making. 
Resolution of the problem of choosing an 
occupation, whether before or after entry
into the world of work, was seen as a
point-in-time event when the individual, more or
less consciously and rationally, appraised 
his personal assets and liabilities, surveyed 
the employment opportunities open to him, 
and decided upon the one which offered him 
the greatest chances for job satisfaction and 
success. Epitomized by the picture, often 
found in the guidance literature of the 
1930s and 1940s, of a young man or woman 
deliberating about which career path to 
follow at the “crossroads” of life, the traditional
view of vocational choice has
highlighted the time-bound, largely static 
nature of how and when people select occupations.

In contrast, as early as 1940, but espe- 
cially since 1950, there has been an increas- 
ing focus upon a different, although not 
necessarily an antithetical (Beilin, 1955), 
conception of vocational choice as a long- 
term, developmental process rather than a 
momentary or transitory phenomenon. 
Presaged by Carter's (1940) hypotheses on 
the formation of vocational attitudes and 
Super's (1942) discussion of vocational ex- 
ploration and establishment as life stages, 
current explanations of career choice propose
that the individual makes not one but
a series of related decisions which culmi- 
nate in the eventual selection of an occu- 
pation and entry into it (Dysinger, 1950; 
Ginzberg, Ginsburg, Axelrad, & Herma, 
1951; Super, 1953). According to most the- 
ories of vocational development, the choice 
process spans the years of adolescence, from 
approximately age 10 to age 21, and is di- 
vided into several discernible periods or 
stages, each of which is characterized by 
systematic changes in the individual’s deci- 
sion making and planning. The major be- 
havioral dimensions along which vocational 
development has been postulated as pro- 
ceeding are the following: realism of voca- 
tional choice (Ginzberg et al., 1951); con- 
sistency of vocational choice, crystallization 
of vocational traits, and maturity of atti- 
tudes toward vocational choice (Super, 
1955); and, clarification of the voca- 
tional self-concept (O'Hara & Tiedeman, 
1959). Thus, within a developmental con- 
ceptual framework, vocational choice is not 
a single, isolated act of the individual: it is 
a comprehensive, multifaceted, ongoing 
process which encompasses many interre- 
lated behaviors of the individual at various 
points in his prework life. 

Unlike the older conception of vocational 
choice, which was largely inductive in ori- 
gin, having been defined operationally and 
correlated with other variables in numerous 
studies (Roe, 1956; Super, 1957), the newer 
developmental point of view is primarily 
deductive in nature, having been inferred 
from more general theories of development 
and personality (Super & Bachrach, 1957). 
As a result, developmental hypotheses
about the dynamics of vocational choice
tend to be more explicit and provocative
than the null hypotheses of the trait-and-factor
approach; but they are also less testable,
since many of the terms in them have
not as yet been given empirical meaning
(Borow, 1960; Hall, 1963). Ginzberg, Ginsburg,
Axelrad, and Herma (1951) have proposed
that the choice process ends in a
compromise, or “balance”, between the
individual’s personal capabilities and the
limitations imposed by his environment;
but, as Super (1953) has noted,


the nature of the compromise between self and 
reality, the degree to which and the conditions 
under which one yields to the other, and the way 
in which compromise is effected [p. 187] 


have not been sufficiently analyzed or delineated
to allow the construction of appropriate
measuring instruments.² Similarly,
the concept of vocational maturity (Super,
1955) has not been translated into behaviors
which are unambiguously defined
or which have demonstrated empirical significance.
Not only are there five different
definitions of vocational maturity, but none
of them has been shown to be related to
age, which is a necessary, although not sufficient,
condition for the measurement of
any developmental variable (Crites, 1961).
Likewise, problems remain in the assessment
of the vocational self-concept and the
relationship it bears to age (O'Hara &
Tiedeman, 1959). As presently defined, in
terms of agreement between self-estimated
and actual test scores, there is no way to
determine whether trends in the various aspects
of the vocational self-concept over
time are a function of the procedures used
to measure it, e.g., the differential difficulty
of estimating scores on tests of aptitudes,
interests, and values, or the actual effect


² Blau, Gustad, Jessor, Parnes, and Wilcock
(1956) have suggested an approach to the measurement
of compromise but only in general terms.
They write:

To study this process, repeated intensive interviews
with entrants into the labor market would
have to discern how modifications in occupational
expectations and values are produced by
various social experiences, such as inability to
get a job, expulsion from professional or vocational
school, being repelled by unanticipated
aspects of the work, and many others [p.
of development as the individual grows
older. 

Until the central concepts and terms of
vocational development theories are operationally
defined in satisfactory ways,
whether by test or nontest methods, it will
not be possible to evaluate empirically the
major propositions which have been advanced
to account for individual differences
in vocational behavior. Borow (1961) has 
observed, 


Very soon now we must begin to specify and in- 
stall the methodological ground rules by which 
research in career development can be productively
accomplished [pp. 23-24].


Among the recommendations which he 
makes, the following two pertain directly 
to the problem of devising adequate meas- 
ures of vocational development variables: 


1. We must learn to make a much clearer sepa- 
ration between observation terms and hypotheti- 
cal terms in talking about career development 
(Borow, 1956; Bugelski, 1960). In this connection, 
we shall have to strive to erect some workable 
coordinating definitions that will link data-based 
variables to noninstantual variables as, for exam- 
ple, the link between verbally expressed vocational 
indecision and inferred role conflict. 

2. We must heed the admonition of Butler 
(1954), Travers (1959), and Loevinger (1957) that 
conventionally developed psychometric instru- 
ments are not logically constructed to help us 
with our newer, conceptually rich research questions.
We need, as Travers (1959) has stated in
reviewing Loevinger's important monograph, instruments
based on a theory of test scores which
has meaning for a theory of behavior in nontest
situations. In short, we need tests which can be
used in construct validation [pp. 23-24].


It is apparent that a considerable amount 
of technique research must be completed 
before meaningful and reliable critical research
on hypotheses about vocational development
can be conducted (Edwards,
1954).

Accordingly, an extensive longitudinal 
study of vocational development was 
planned at the University of Iowa, in which 
the first phase was to be devoted entirely 
to instrumentation. The purpose of this 
stage in the overall research project, which
will be followed by others on theory testing
and practical applications, was twofold:
First, the basic objective was to construct
and standardize a measuring procedure
which would operationally define one of the 
more potentially useful concepts in the var- 
ious theories of vocational development. 
Because of its comprehensiveness as an ab- 
straction of several developmental princi- 
ples, e.g., the increasing goal directedness, 
independence, and realism of vocational be- 
havior (Super & Overstreet, 1960), and be- 
cause it enters into a large number of 
theoretical propositions, the concept of vo- 
cational maturity was selected as the one 
which appeared to have the greatest prom- 
ise for further definition and measurement. 
Second, a more general goal which it was 
hoped could be attained through the analy- 
sis and assessment of vocational maturity 
was to make a contribution to theory and
methodology on the measurement of de- 
velopmental variables. A commitment was 
made to incorporate and evaluate as many 
of the newer psychometric principles (e.g.,
Berg, 1959; Fiske & Butler, 1963; Ghiselli, 
1963) as possible in designing a multidi- 
mensional test of the behaviors subsumed 
by the concept of vocational maturity. The 
instrument which has been constructed in 
accordance with these purposes is the Vo- 
cational Development Inventory (VDI), 
which is comprised of two measures of vocational
maturity: the Attitude test* and
the Competence test. This monograph reports
the initial data on the construction
and standardization of the Attitude test. 


CONCEPT OF VOCATIONAL MATURITY


Although the concept of vocational ma- 
turity is a relatively new one (Super, 
1955), it has antecedents in the early work 
on vocational interest measurement as well 
as the more recent theory construction on 
vocational development. Carter’s (1940) 
classic research on the patterning of in- 
terests in adolescents led him to conclude, 


The development of vocational interests involves 
interactions between growth processes, some of 
which are educationally controlled and some of 
which are biologically controlled.... Growth in 
this field is a part of general maturation, of de- 
veloping individuality [p. 187]. 


* The Attitude test was originally entitled the
Concept test (Crites, 1964).


Strong's (1943, 1955) studies of his Interest 
Maturity scale, which reliably differentiates
the likes and dislikes of adolescents
and adults, have also suggested that voca- 
tional behavior may change systematically 
with increasing age. More directly related 
to the formulation of the concept of voca- 
tional maturity, however, are the later the- 
oretical contributions of Dysinger (1950) 
and Ginzberg, Ginsburg, Axelrad, and 
Herma (1951). In arguing for a philosophy 
of vocational guidance grounded in the de- 
velopmental history of the individual, Dy- 
singer (1950) pointed out, 


The guidance movement needs a word, parallel 
to the word “socialization” in social development, 
to express the vocational implications of matura- 
tion. The terms “vocational decision” and “voca- 
tional choice” suggest a single decision, but the 
emphasis should be placed upon the develop- 
mental process [p. 198]. 


Similarly, Ginzberg, Ginsburg, Axelrad, and 
Herma (1951) have observed, 


To some degree, the way in which a young person 
deals with his occupational choice is indicative 
of his general maturity and, conversely, in assessing 
the latter, consideration must be given to the way 
in which he is handling his occupational choice 
problem [p. 60]. 


An explicit statement of the concept of 
vocational maturity was not made, how- 
ever, until Super (1955) delineated the 
behavioral dimensions and quantitative 
indexes which might be specified to define 
it operationally. Beginning with the theo- 
retical definition of vocational maturity as 
"the place reached on the continuum of 
vocational development from exploration 
to decline [Super, 1955, p. 153]," he went
on to enumerate and describe five di- 
mensions along which vocational behavior 
might mature during early adolescence and 
for which measures could be devised. These 
included the following: 

1. Orientation to vocational choice. De- 
fined as concern with the problem of voca- 
tional choice, and the use of resources in 
solving the problem. Measured by judges' 
ratings of interview protocols. 

2. Information and planning. Defined as 
specificity of information about the chosen 
occupation, and extent and specificity of 
planning with respect to chosen occupation.
Measured by judges’ ratings of interview
protocols.

3. Consistency of vocational choice. De- 
fined as stability of vocational choices 
over time, and agreement among vocational
choices in field, level, and family.
Measured by discrepancies between vocational
choices elicited on different occasions
and classified into different cells of
Roe’s (1956) occupational classification 
scheme. 

4. Crystallization of traits. Defined as ex- 
tent to which vocationally relevant apti- 
tudes and personal dispositions, such as 
mechanical comprehension and work values,
have developed toward adult status.
Measured by judges’ ratings of interview 
protocols and standardized tests. 

5. Wisdom of vocational choice. Defined 
as extent to which vocational choice agrees 
with abilities, activities, interests, and
socioeconomic background. Measured by
discrepancies between vocational choice and
indexes of the various reality factors. 

It is clear from the definitions of these
dimensions that the concept of vocational
maturity is more comprehensive than vocational
choice, including not only the selection
of an occupation but also attitudes toward
decision making, comprehension and
understanding of job requirements, planning
activity and ability, and development
of vocational capabilities. In fact, Super
and Overstreet (1960) have hypothesized
that vocational maturity is a multidimensional
construct, rather than one unitary
variable; and Crites (1964) has elaborated
upon their formulation by proposing that
the “orientation to vocational choice,”
“information and planning,” and certain aspects
of the “crystallization of traits” dimensions
can be further analyzed into
several different kinds of choice “competencies”
and “attitudes.” Conceived of primarily
as cognitive or ego functions, choice
competencies involve such mental processes
as assimilating information about self and
reality, resolving conflicts between alternative
courses of action, establishing future
goals, and relating means to ends through
planning. In contrast, choice attitudes are
more conative in nature and refer to
involvement in the choice process, orientation
toward work, independence in decision
making, preference for choice factors, and
conceptions of the choice process. Together
with the “consistency of vocational choice”
and “wisdom of vocational choice” dimensions,
these choice competencies and attitudes
can be thought of as comprising the
construct of vocational maturity much as is
depicted in Figure 1.

Fig. 1. The construct of vocational maturity as derived from theories of vocational development.

In this diagram, the model which has 
been used is similar to the one suggested by 
Vernon (1950) for representing the struc- 
ture of intelligence as inferred from factor 
analyses, the major difference being that 
the construct of vocational maturity is not 
nearly as well substantiated empirically. 
The first level of the construct has been 
termed “degree of vocational development” 

and is comparable to a general or “second-order”
factor. This variable has been defined
in several different ways (Super,
1955; Super & Overstreet, 1960), but probably
the most parsimonious definition is
that it


refers to the maturity of an individual's vocational
behavior as indicated by the similarity
between his behavior and that of the oldest
individuals in his vocational life stage [Crites, 1961,
p. 259].


As a suprafactor which denotes the individual's
overall progress toward vocational
maturity within a given period, “degree of
vocational development” is hypothesized as
being moderately positively related to each
of the dimensions of vocational maturity,
which can be thought of as group or “common”
factors. The expectation is that each
of these dimensions is, in turn, correlated in 
the .30s or .40s with the others. For ex- 
ample, it seems reasonable to predict that 
relatively mature vocational choice atti- 
tudes may mediate not only consistent and 
wise (realistic) vocational choices but also 
the vocational choice competencies which
facilitate mature decision making. Finally, 
the dimensions are comprised of various 
specific variables which reflect the same 
pattern of interrelationships: moderate as- 
sociations between groups and fairly high 
ones within groups. If adequate measures
of these factors can be devised, and if they 
intercorrelate as shown in Figure 1, it may 
eventually be possible to graph an indi- 
vidual’s vocational maturity on a profile 
sheet (Super, 1955) which indicates both 
his degree and rate of vocational de- 
velopment (Crites, 1961) along each di- 
mension. 
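
The pattern of interrelationships described above can be illustrated with a small worked calculation. This is a sketch under assumptions that are not stated in the text: all variables are standardized, each dimension loads on the general factor with the same loading a, each specific variable loads on its dimension with the same loading b, and all unique parts are uncorrelated. Under those assumptions two dimensions correlate a squared, two variables within the same dimension correlate b squared, and two variables from different dimensions correlate a squared times b squared. The function name and the values a = .6 and b = .9 are hypothetical, chosen so that the dimensions correlate in the .30s.

```python
def implied_correlations(dimension_loading, variable_loading):
    """Correlations implied by a simple two-level factor hierarchy.

    dimension_loading: loading (a) of each dimension on the general factor.
    variable_loading:  loading (b) of each specific variable on its dimension.
    Returns (between dimensions, within a group, between groups).
    """
    between_dimensions = dimension_loading ** 2
    within_group = variable_loading ** 2
    between_groups = within_group * between_dimensions
    return between_dimensions, within_group, between_groups


# Hypothetical loadings, chosen only for illustration.
dims, within, between = implied_correlations(0.6, 0.9)
print(round(dims, 2), round(within, 2), round(between, 2))
```

With these illustrative loadings the dimensions correlate about .36 with one another, variables within a dimension about .81, and variables from different dimensions about .29, which reproduces the expected ordering of fairly high associations within groups and moderate ones between groups.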


MEASUREMENT OF VOCATIONAL MATURITY 


Very few measures of vocational matu- 
rity have been devised, and those that have 
been developed either suffer from serious 
shortcomings or need further research. In a 
study designed primarily for another pur- 
pose, i.e., to determine client satisfaction 
with counseling, Nelson (1956) has re- 
ported data on an operational definition of 
vocational maturity which is patterned 
after Super’s (1955) dimension of “wis- 
dom of vocational choice,” but which is 
also notably different from it. Nelson’s 
system for classifying clients according to
their vocational maturity consisted of only 
two categories: mature and immature. Op- 
erationally, a client was considered to be 
vocationally mature if any of his expressed 
interests, as elicited during the initial inter- 
view or on an information form, was “in 
harmony with his inventoried interests and 
tested aptitudes.” A client was classified as 
vocationally immature if he had no ex- 
pressed interests, or if his expressed interest 
was disconsonant with his inventoried in- 
terests or tested aptitudes. In his sample of 
88 clients, 67 males and 21 females, Nelson 
found that 54 (61%) were vocationally 
mature and 34 (39%) were vocationally 
immature. Although he had age data on his 
subjects, Nelson did not analyze the rela- 
tionship between vocational maturity, as he 
defined and measured it, and chronological age.


Much more extensive research than Nel- 
son’s on the measurement of vocational ma- 
turity has been done as part of Super’s 
Career Pattern Study, a 20-year longitudi- 
nal investigation of the vocational develop- 
ment of a sample of males between the 
ages of 15 and 35 (Super, Crites, Hummel, 
Moser, Overstreet, & Warnath, 1957). As 
indicated above, indexes based upon a variety
of data (from interview protocols,
standardized tests, occupational classification
schema, etc.) have been constructed
to assess the various hypothesized dimensions
of vocational maturity. On the theoretical
expectation that the indexes “should
have some amount of positive interrelation- 
ship to be considered measures of vocational 
maturity [Super & Overstreet, 1960, p. 49],” 
correlational and factorial analyses of data 
collected on 142 ninth-grade boys were con- 
ducted. In general, the correlations among 
the indexes were either nonsignificant or, 
contrary to prediction, low negative (e.g., 
“Consistency of vocational preferences” 
versus “Use of resources in orientation,” 
average r = −.20). The one exception was
the dimension called “Orientation to voca- 
tional choice,” the two indexes of which, 
“Concern with choice” and “Use of re- 
sources in orientation,” tended to be posi- 
tively correlated with each other. Neither 
the components of these indexes nor their
total scores were related systematically to
chronological age, however, the correlations
with the latter variable, the highest of which
was −.17,⁵ being significant in only one
instance. From these findings and those of 
the factor analysis, Super and Overstreet 
(1960) concluded that only four of their 
original 20 indexes of vocational maturity 
were construct valid and that these 

consist of one general factor, Planning Orientation, 
and three group factors which contribute differently
to the four indices [p. 75].


There are several shortcomings of these
approaches to the measurement of vocational
maturity which have limited their
applicability and usefulness and which accentuate
the need for further research.
First, and probably most important, neither
Nelson’s categories nor Super’s indexes have
been shown to correlate with chronological
age in a theoretically meaningful way. Any
measuring device which purports to assess
a developmental variable such as vocational
maturity must yield scores, however, which
either increase or decrease with age, i.e.,
they must be a monotonic function of age,
during some given period of development.
The scores may not correlate with age
across the total span of development because
certain behaviors may mature only
at given times, according to what has been
called the “principle of developmental preeminence”
(Beilin, 1955; Jersild, 1946),
but during periods of change the scores
should be systematically related to age.
This means that in constructing a measure 
of vocational maturity a wide range of ag 
levels should be used in its standardization 
Second, although some aspects of vocation 

maturity, such as the consistency an 
realism of choice, will necessarily have to 
be assessed by nonpsychometric methods 
e.g., occupational classification schema, tht 
most satisfactory techniques for measurin: 


⁵ This r is negative because, as Super and Overstreet 
(1960, p. 105) have explained, the younger 
subjects in their sample were the brighter ones, 
and since intelligence was positively correlated 
with the vocational maturity indexes, age had to 
be negatively related. In an unselected sample 
the expectation would be that both variables 
would be positively associated with vocational 
maturity. 


choice competencies and attitudes would 
appear to be tests and inventories. Not only 
are they more objective than interview notes 
and protocols, as used by Nelson and Super, 
but they also are more economical, can be 
administered to large samples, and yield 
[data] on a generally higher level of measurement. 
Finally, only a few of the behaviors 
which theoretically comprise the construct 
of vocational maturity have had measures 
developed for them. As Nelson (1956) 
points out, his categories are restricted to 
the evaluation of choice realism, and Super’s 
indexes are limited to the appraisal of 
orientation to vocational choice, at least in 
the ninth grade. Thus, a number of other 
choice competencies and attitudes remain 
to be measured before the concept of vocational 
maturity can be given fuller empirical 
meaning. 
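
The first shortcoming above turns on whether scores are a monotonic function of age over some period of development. As a rough illustration only, the check might be sketched as follows; the function name, the `tolerance` parameter, and the sample means are hypothetical and do not come from the study.

```python
def monotonic_direction(level_means, tolerance=0.0):
    """Classify a sequence of age-level means as monotonic or not.

    level_means: mean scores ordered from the youngest to the oldest level.
    tolerance:   hypothetical allowance for small reversals (default: none).
    """
    # Differences between each pair of adjacent age levels.
    steps = [later - earlier
             for earlier, later in zip(level_means, level_means[1:])]
    rises = any(step > tolerance for step in steps)
    falls = any(step < -tolerance for step in steps)
    if rises and not falls:
        return "increasing"
    if falls and not rises:
        return "decreasing"
    return None  # flat throughout, or reversals larger than the tolerance


# Hypothetical level means, for illustration only.
print(monotonic_direction([1.2, 1.4, 1.4, 1.7]))
print(monotonic_direction([3.1, 2.8, 2.9, 2.2]))
```

A descriptive check of this kind says nothing about statistical significance; it only indicates whether the ordering of the means is consistent with a developmental trend.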
CONTENT AND DESIGN OF THE 
ATTITUDE TEST 


The VDI has been conceived and constructed 
to measure more completely than 
previous procedures the behavior domains 
of choice competencies and attitudes in vocational 
maturity, which are [measured] respectively 
by two subtests: the Competence 
test and the Attitude test. The Competence 
test will be dealt with in greater detail in 
later reports, but is briefly described here 
so that it can be contrasted with the Attitude 
test and thus sharpen the theoretical 
definition of the latter. In its first experimental 
form the Competence test consists 
of five parts, each of which is comprised of 
30 multiple-choice items with from three to 
five foils. Part I is the Problems test, which 
is designed to measure the ability to resolve 
conflicts between the factors in vocational 
choice. Part II is the Planning test, in 
which the task is to order scrambled series 
of steps leading to various vocational goals. 
Part III is the Occupational Information 
test, which includes items on job duties and 
tasks, trends in occupations, and future employment 
opportunities. Part IV is the Self-Knowledge 
test and is scored against standardized 
test information for accuracy of 
estimated vocational capabilities. Part V 
is the Goal Selection test, the items of 
which require the examinee to choose the 
"best" (most realistic) occupation for a 
hypothetical individual who is described 
in terms of his aptitudes, interests, and 
personality characteristics. The functions 
or processes which are supposedly involved 
in taking the Competence test, then, are 
largely what might be designated as comprehension 
and problem-solving abilities as 
they pertain to the vocational choice process. 


In contrast, the Attitude test was de- 
signed to elicit the attitudinal or disposi- 
tional response tendencies in vocational 
maturity which are nonintellective in na- 
ture, but which may mediate both choice 
behaviors and choice aptitudes. The items 
for this test were developed from a combi- 
nation of the best features of the empirical 
and rational methods of test construction. 
The empirical approach to item selection 
has the advantage that only items which 
are valid in differentiating between criterion 
groups, and which are properly cross-stand- 
ardized, are selected for the test (Meehl, 
1945). But, it has the distinct disadvantage, 
particularly in measures of constructs, of 
including items which are sometimes pheno- 
typically nonsensical (Travers, 1951), such 
as the Minnesota Multiphasic Personality 
Inventory item “I think Lincoln was greater 
than Washington,” which differentiates be- 
tween several dissimilar criterion groups 
and normals for no apparent reason. Jessor 
and Hammond (1957) have observed: 


If one is concerned only with the predictive va- 
lidity of a test, the matter of item content is 
relatively unimportant, for the empirical item- 
criterion correlations provide criteria for the final 
selection of items. However, when a test-developer 
insists that his purpose includes more than the 
prediction of a particular criterion performance 
and that the test items are intended to be indica- 
tors of a construct, then item content becomes 
highly important, and item-criterion correlations 
only are insufficient... . Therefore, test items 
which are intended to indicate a construct should 
be selected by rational (rather than intuitive) 
means [p. 164]. 


The basic difficulty with the rational ap- 
proach, however, is that items which are 
empirically nonvalid for the construct they 
are supposed to measure, because they do 
not correlate with behavior-relevant varia- 
bles (American Psychological Association, 
1954), are often retained in a test solely 
on the basis of their content or "face" va- 
lidity, as in the case of the Bernreuter 
Personality Inventory (Landis & Katz, 
1934; Landis, Zubin, & Katz, 1935). An- 
other problem in rational test development 
is the confusion which has arisen over what 
is meant by a "construct" (Bechtoldt, 
1959). The term has been used in at least 
two quite different ways: first, to refer to 
an internal state or condition of the organ- 
ism, such as anxiety, which may be inferred 
from test responses and which may be 
thought of as more or less hypothetical in 
nature; and, second, to denote a pattern 
of interrelationships among a set of varia- 
bles which is differentiable empirically from 
other patterns or variables, as in factorial 
and similar types of multivariate analysis 
(Cronbach & Meehl, 1955). The first mean- 
ing of construct is exemplified by the 
mental processes which the Competence 
and Attitude tests were designed to meas- 
ure, and the second meaning is illustrated 
by the predicted moderate positive correla- 
tions among the dimensions of vocational 
maturity. In other words, choice compe- 
tencies and attitudes are seen as hypo- 
thetical variables which are inferable from 
verbal test behavior, whereas the construct 
of vocational maturity can be viewed as a 
behavioral syndrome defined by the em- 
pirical relationships among the variables 
which comprise it. If these distinctions are 
made, and standardization and validation 
data are gathered accordingly, it would 
seem that the most reasonable approach to 
test construction would be one which in- 
corporates the merits of both the empirical 
and rational methods and avoids their 
shortcomings. Such a strategy was followed 
as much as possible in the design of the 
Attitude test. 

To establish the rational or "logical va- 
lidity" (Jessor & Hammond, 1957) of the 
Attitude test the general model for writing 
items proposed by Flanagan (1951) was 
followed, with some minor modifications. 
This model outlines three steps in item 
construction: 

1. Description of the behavior (the definition, 
delimitation, and illustration of the 
variety and scope of the actions included 
in the items). The behavioral descriptions 
and definitions which were used in writing 
items for the Attitude test are listed in 
Appendix A. Considering the possible universe 
of choice attitudes, these particular 
dimensions appear to be a fairly representative 
sample, having been selected from 
various statements of vocational development 
theory and inferred from relevant research 
findings (Ginzberg et al., 1951; 
Small, 1953; Super, 1957). 

2. Analysis of the behavior (the classification 
of a specific behavior or item with 
respect to other behaviors and hypotheses 
about its generality and predictability). For 
each dimension in Appendix A from 10 to 
25 items were written on the assumption 
that the behavior stated in the item would 
mature with increasing age (or grade). In 
other words, the research hypothesis for 
each item was that it would be consistently 
and systematically (monotonically) related 
to age (or grade). The items were written 
so as to maximize their relationships to age 
(or grade) and to minimize their association 
with other variables, such as sex differences, 
socioeconomic status, and urban-rural 
residence. The goal was to devise 
items which were as generally applicable as 
possible. 

3. Formulation of item specifications (decisions 
about the type of item content and 
response format appropriate to measure 
the specified behaviors). Two variables were 
experimentally manipulated in order to determine 
their effects upon the power of the 
items to differentiate between age (or 
grade) levels. First, some question has been 
raised whether self-report items should be 
written in the first or third person singular, 
the argument being that the latter may be 
more subtle and hence more valid (Guilford 
& Zimmerman, 1949). Consequently, some 
items (“Sometimes I wish I never had to 
work”) were written in the more personal 
grammatical form, and others (“Work is 
drudgery”) were stated in the more impersonal 
form. Second, the response format 
of the Attitude test was also varied. In 
one version, a 5-point Likert rating scale 
was used to indicate degree of item endorsement, 
whereas in the other only dichotomous 
"true-false" options were presented. 

[Figure 2: item type crossed with response format; total 100 items.] 
Fig. 2. Experimental design for variation of item 
type and response format of the Attitude test. 


The experimental design for the Attitude 
test is summarized in Figure 2, which shows 
the various combinations of item type and 
response format. In accordance with this 
design, each test booklet was divided into 
two parts (1 and 2), with the items in the 
first half being stated in the third person 
singular and those in the second half in the 
first person singular. From a pool of approximately 
1,000 items, 50 items were selected 
for each part of the booklet, making 
a total of 100 items for the initial standardization. 
This set of items was administered 
in two different forms (I and II), 
with the following instructions: 


[Form I:] Below are a number of statements about 
vocational choice and work. Read each statement 
and indicate on the separate answer 
sheet the extent to which you agree with the statement. 
[Mark] in the appropriate column on the answer 
sheet whether you “strongly disagree,” “disagree,” 
“neither disagree nor agree,” “agree,” or “strongly 
agree” with the statement. 

[Form II: ...] answer sheet. If you disagree or 
mostly disagree with the statement, place a mark 
in the column [headed] “F” on the answer sheet. 


International Business Machine 
and Measurement Research Center 
(MRC) answer sheets, which have provisions 
for either scaled or true-false responses, 
have been used in administering 
the Attitude test, but the MRC answer 
sheet can be punched directly on to cards 
or reproduced upon magnetic tape and consequently 
has been considered to be preferable 
for large sample data processing. 

STANDARDIZATION PROCEDURE FOR THE 
ATTITUDE TEST 


Data Collection 


To evaluate the “empirical validity” of the Attitude 
test, at least in so far as its items might be 
related to age (or grade), the experimental design 
in Figure 2 was replicated with male and female 
subjects at each of the age and grade levels shown 
in Figure 3. Grade as well as age was used as a 
criterion of vocational development, since it may 
be that the significant changes which occur in the 
maturation of vocational behavior are more closely 
associated with the impact of the educational system 
upon the individual than the mere passage of 
time. Not only is the school a major agent of socialization, 
but it also institutionalizes the developmental 
tasks, such as choice of a life's work, 
which society expects the individual to accomplish 
at certain designated points in time 
([...] & Bachrach, 1957). The [...] 
at which vocational developmental [...] 

[...] of the Attitude test were written so 
[that ...] the fifth and sixth grades could 
[...] them. That this objective [was] 
achieved is indicated by a reading difficulty 
[level] for the inventory of 5.9595 in grade 
[...] Dale and Chall (1948) [formula] 
for predicting readability. This value is 
[...] considerably by the use of the word “occupation” 
[in] the items, which is not in[cluded 
in the ...] list of familiar words for 
[...]. [When] the formula is computed 
without “occupation,” the value is 5.1700, which 
is only slightly above the fourth-grade level. Since 
occupation was defined for the subjects in the 
lower grades, the smaller Dale-Chall value would 
seem to better represent the reading difficulty of 
the Attitude test. 
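
For readers unfamiliar with the readability index cited above, the following is a minimal sketch of the Dale-Chall (1948) raw score in its commonly cited form, which combines the percentage of words not on the Dale list of familiar words with the average sentence length. The word lists and helper names below are hypothetical stand-ins, not the inventory's items or the actual Dale list; the sketch only shows why treating a frequently used word such as "occupation" as familiar lowers the computed value.

```python
def pct_unfamiliar(words, familiar_words):
    """Percentage of words that are not in the familiar-word list."""
    unfamiliar = [w for w in words if w.lower() not in familiar_words]
    return 100.0 * len(unfamiliar) / len(words)


def dale_chall_raw_score(pct_unfamiliar_words, avg_sentence_length):
    """Commonly cited form of the Dale-Chall (1948) raw score."""
    return 0.1579 * pct_unfamiliar_words + 0.0496 * avg_sentence_length + 3.6365


# Hypothetical four-word "text" and familiar-word list, for illustration only.
words = ["work", "is", "an", "occupation"]
familiar = {"work", "is", "an"}

with_word = dale_chall_raw_score(pct_unfamiliar(words, familiar), 4)
without_word = dale_chall_raw_score(
    pct_unfamiliar(words, familiar | {"occupation"}), 4)
print(with_word > without_word)  # counting "occupation" as unfamiliar raises the score
```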

The test was administered on a group basis 
during the 1961-62 academic year in selected 
schools of the Cedar Rapids, Iowa, elementary and 
[secondary school] system. This city was chosen as the 
base-line community for standardizing the Attitude 
test not only because of practical considerations, 
such as the cooperation and interest of the 
school counselors and officials,⁶ but also because 
Cedar Rapids, with a population of approximately 
92,000 in the 1960 census, has a fairly diversified 
economy and representative social structure. There 
are both large and small commercial and industrial 
concerns in the area, and the schools draw students 
from all socioeconomic levels. In selecting 
schools for the initial administration of the Attitude 
test, an effort was made to represent as 
closely as possible the various high and low rent 
districts of the city. At the elementary and junior 
high school levels, this objective was more satisfactorily 
achieved than it was at the senior high 
school level. Five elementary and two junior high 
schools were chosen by school officials, on the 
basis of previous studies and their personal knowledge, 
as representative of the system as a whole 
at these levels, but only one of the two senior high 
schools was selected for the first testing, due to 
problems of scheduling and time. As a consequence, 
the senior high school data may be somewhat 
biased, but this will not be known until they 
can be compared with results from the other high 
school, which was subsequently tested in 1963. 

[Figure 3: age levels from 11-5 and under to 17-6 and over, in one-year intervals, aligned with Grades 5 through 12; core sample indicated.] 
Fig. 3. Sampling design for standardization of the Attitude test. 


⁶ Appreciation is expressed to the following 
counselors and officials of the Cedar Rapids Com- 
munity School District, without whose coopera- 
tion the Vocational Development Project would 
not have been possible: I. J. Semler, Director of 
Research, Cedar Rapids Community School Dis- 
trict; B. R. Gearheart, Director, Department of 
Special Services, Cedar Rapids Community School 
District; G. Novak and D. Wegner, Counselors, 
Washington Senior High School; R. Lamansky, 
Helen Masha, and H. Myron, Counselors, Jefferson 
Senior High School; Eileen Davis, D. Page, H. 
Roloff, and R. Shaffer, Counselors, Franklin Junior 
High School; M. Berg, R. O. Fitzsimmons, L. 
D. Hahn, D. B. Lindsay, T. Moran, K. Moreland, 
W. L. Paxson, P. Solar, G. Thompson, P. A. Tracy, 
and W. S. Van Deest, Principals, Cedar Rapids 
Schools. 


Also, it should be noted that data were collected 
in the senior high school in the fall, whereas they 
were obtained in the other schools during the 
spring. 


Sampling Design 


The plan for sampling the subjects within the 
schools where the testing was conducted is diagramed 
in Figure 3. At each age and grade level, 
the limits of which were essentially the same, a 
cross-sectional sample was designated for testing 
at any given point in time. Usually, this sample 
consisted of all students who were in attendance 
on the day the Attitude test was administered. The 
core samples, which will be followed up and tested 
from year to year, are comprised of those students 
who continue in the school system or who can be 
contacted after they drop out or graduate. On 
successive years, the cross-sectional samples will 
most likely change, due to students transferring 
in and out of the system, but the core samples 
will remain the same. Data on the latter will not 
only be used to standardize the Attitude test longitudinally 
but also to identify possible developmental 
stages in choice attitudes and to determine 
whether the test predicts vocational adjustment 
after occupational entry. 
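
The age levels in the sampling design are one-year intervals written as years-months (e.g., 11-6 to 12-5), with open-ended intervals at both extremes. A minimal sketch of how a subject's age might be assigned to one of these levels is given below; the function and constant names are hypothetical, and the wording of the two open-ended labels is an assumption.

```python
# Age levels as years-months; the first and last levels are open-ended.
AGE_LEVELS = [
    "11-5 or younger", "11-6 to 12-5", "12-6 to 13-5", "13-6 to 14-5",
    "14-6 to 15-5", "15-6 to 16-5", "16-6 to 17-5", "17-6 or older",
]


def age_level(years, months):
    """Return the age-level label for an age given in years and months."""
    total_months = years * 12 + months
    lowest_closed = 11 * 12 + 6   # 11-6, start of the first closed interval
    highest_open = 17 * 12 + 6    # 17-6, start of the open-ended top level
    if total_months < lowest_closed:
        return AGE_LEVELS[0]
    if total_months >= highest_open:
        return AGE_LEVELS[-1]
    # Each closed interval spans exactly 12 months.
    return AGE_LEVELS[(total_months - lowest_closed) // 12 + 1]


print(age_level(14, 0))
```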

The Ns for the 1961-62 testing are given in 
Table 1. For some of the age intervals, notably 
the youngest and the oldest, the Ns are smaller 
than would have been desirable, particularly when 
the sexes are considered separately. Much the 
same can be said about the lowest and highest 
grade levels, although their Ns are somewhat 
larger than those for age. The Ns for Form I for 
both the age and grade breakdowns are smaller 
than the Ns for Form II, since not as many subjects 
were needed to compute stable means from 
the 5-point rating scales of Form I as were required 
for stable percentages based upon the true-false 
responses to Form II. In general, the Ns 
were adequate for the analyses which were made 
for the initial standardization of the Attitude test, 
TABLE 1 

NUMBER OF MALE AND FEMALE SUBJECTS STRATIFIED BY AGE AND GRADE IN THE STANDARDIZATION 
SAMPLES FOR FORMS I AND II OF THE ATTITUDE TEST 
(1961-62) 

                         Form I                    Form II 
                  Male  Female  Total       Male  Female  Total 
Age 
  ≤11-5            30     25      55          59     78     137 
  11-6 to 12-5     40     24      64          94     80     174 
  12-6 to 13-5     20     24      44         296    312     608 
  13-6 to 14-5     14      9      23         324    304     628 
  14-6 to 15-5     31     51      82         408    413     821 
  15-6 to 16-5     37     51      88         116    113     229 
  16-6 to 17-5     47     56     103          83     70     153 
  ≥17-6            26     30      56          45     27      72 
  Total                          515                       2822 
Grade 
  5                37     30      67          88     95     183 
  6                38     21      59          91     65     156 
  7                22     22      44         329    336     665 
  8                15     11      26         318    289     607 
  9                10     17      27         362    345     707 
  10               33     54      87          79    137     216 
  11               40     59      99          79     60     139 
  12               58     68     126          80     69     149 
  Total                          535                       2822 

Note. In some instances, the Ns for certain analyses were slightly different than those given in this 
table, since some subjects, for one reason or another, were unable to complete their tests. Also, the Form 
I Ns for age and grade are not exactly the same, whereas those for Form II are, because of differences 
in key punching errors in processing the answer sheets. 


but it should be cross-standardized on additional 
samples to increase the Ns at certain age and 
grade levels; and plans have been made to do this 
in the near future (Crites, 1964). 


Analysis of Data 


The analysis of the age and grade data was accomplished 
in three steps. First, the responses to 
each form of the Attitude test were analyzed separately 
by sex and then for the total group in a 
simple randomized design (Lindquist, 1953), 
where the between-subjects factor was either age 
or grade. For Form I numerical values of 1-5 were 
assigned to the points on the rating scale, so that 
1 indicated “strongly disagree” and 5 corresponded 
to “strongly agree”; for Form II numerical values 
of 1 and 2 were given to true and false, respectively. 
Items were retained for further statistical 
evaluation, if their F values from the analysis of 
variance reached or exceeded the .01 level. Second, 
for these items separate t tests between adjacent 
age or grade means were made to determine 
whether they increased or decreased monotonically. 
If an item mean for a given age or grade 
level was significantly greater or less than the 
other means at the .05 level, then the item was 
rejected as not meeting this criterion for a measure 
of a developmental variable. Rank-order correlations 
(rho) were also computed between the 
item means and age/grade to gain an impression 
of how strong the relationships were. Finally, the 
items which survived this process of elimination 
were scored with a key based upon the mean responses 
of twelfth graders, and average vocational 
maturity scores were computed for each grade. 
Also, the percentages of overlap in the score distributions 
of the grades were determined to estimate 
the discriminating power of the total vocational 
maturity score. In addition to these main 
analyses, certain supplementary ones were conducted 
which employed standard statistical methods, 
such as chi-square tests and product-moment 
correlations. 
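
The first screening step and the rank-order correlation described above can be sketched as follows. This is only an illustration under stated assumptions, not the computation actually used in the study: all function names are hypothetical, the critical F value is supplied by the caller from a table for the chosen level, and the adjacent-means t tests and the .05 rejection rule are not implemented.

```python
from statistics import mean


def one_way_f(groups):
    """F ratio for a one-way analysis of variance (one list of scores per level)."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            result[order[pos]] = shared
        start = end + 1
    return result


def spearman_rho(x, y):
    """Rank-order correlation, computed as Pearson r on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den if den else 0.0


def screen_item(groups, f_critical):
    """Retain an item if its F reaches the caller-supplied critical value."""
    f_value = one_way_f(groups)
    level_means = [mean(g) for g in groups]
    rho = spearman_rho(list(range(1, len(groups) + 1)), level_means)
    return {"F": f_value, "retained": f_value >= f_critical, "rho": rho}


# Hypothetical responses at three ordered levels; 10.92 stands in for a tabled value.
print(screen_item([[1, 1, 2], [3, 3, 4], [5, 5, 6]], f_critical=10.92))
```

Passing the critical value in as a parameter keeps the sketch free of any statistical library; with a library that provides the F distribution, the tabled value could be replaced by a computed probability.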


RESULTS AND CONCLUSIONS 


The data from the age and grade analy- 
ses, and a comparison of these, for response 
format and item type are presented first 
on the total sample. The findings on sex and 
school differences, which were negligible, 
are then reported. Finally, the results ob- 
tained from analyses of the total Vocational 
Maturity (VM) scale for the Attitude test 
are summarized and discussed. 


Age Analyses 
Response Format. The means and stand- 
ard deviations for the 100 items in Form I 
TABLE 2 
MEANS, STANDARD DEVIATIONS, AND F VALUES OF FORM I ATTITUDE TEST ITEMS FOR MALES AND 
FEMALES COMBINED AT AGE LEVELS 11-5 THROUGH 17-6 
(1961-62) 


Total 


4.22" 
2.02* 
2.16* 
2.18* 
.90 
1.45 
.65 
1.82 
1.34 
1.04 
5.91" 

76 | 
64 
57 
.85 
42 
1.36 
16 
Lil 

Eii 

at 

** 

** 

+ 

t 

ж 

ж 

** 

ж 


5р 


.968| 1.90 


1.10 
.934| 2.49* , 
1.00 
.931| 3.42" 
1.24 | 6.70** 


1.15 


1.22 
1.17 
.940| 3.14" 


1.13 | 4.289 
1.03 |15.07** 
1.00 | 1.64 
999 

.990/19.01** 
.998| 3.21% 
1.08 28.74" 
1.01 | 9.87" 
1.03 | 2.36* 
1.09 | 4.71" 
1.10 | 2.60* 
.911| 3.98" 
.950| 3.07" 
.931| 2.10* 


1.02 
1.92 
1.05 
1.21 
917 
1.10 
926 
846 
.942 
979 
.987 
1.01 | 3.83 


1.07 


M 


A 
7 


217 


96} 4.26 
.85| 2.60 
.83| 3.63 
.95| 3.40 
.95| 3.18 


7 


.74| .89| 1.88 
.25| .91| 2.29 


.934.12| .95| 4.21 
17/1.08/3.03/1.03| 3.21 
7| .97/3.42| .89| 3.55 

.9112.20| .85| 2.34 


.20| .94/4.1 
4|2.70/1.18/2.81]1.05| 2 
3/2.58/1.21/|2.36]1.14| 2.50 

-55/1 .14|2.65|1.11| 2.71 


6-5|16-6 to 17-5 


SD|M|SD|M|SD 


M 


.86| .92/3.90] .91/3.99| .82| 3.89 
.05/1.22/2.94/1.22/2.74]1.06| 3.09 


SD 


.98|3.42|1.00)3.41(1.0213.26/1.05) 3.40 


.15/3.43/1.05|3.30/1.09/3.33/1.03| 3.36 
.92/1.74/1.02]1.52| .77/1.64| .96| 1.66 


2.74/1.18/2.78/1.24/|2.81|1.24|2.65]1.10| 2 
4.13| .88/3.95|1.13/3.88|1.02/3.70]1.10| 3.93 


Age 


.68|3.76| .75/3.76| .72/3.73| .79|3.69| .89 3.73 
:88|1.73| .82/1.76| .85|1.79| .9011.93| .83| 1.84 


2.46|1.22/2.50]1.17|2.57|1.24/2.651.17/2.62,1.21| 2.61 


.744.19| .73|4.20| .86/4.19) .96/4.21 


.15/3.92| .87|3.65| .82]3.53| .923.4 
.28/3.35|1.17|2.87]1.08/2.74/1.09|2.62/1.09/2.76| .98| 2.81 


.85|3.65| .83/3.31 


.7811.2113.381.21/2.69|1.01/2.681.02/2.53/1.00/2.45/1.04| 2.66 


2 


11-6 to 12-5/12-6 to 13-5|13-6 to 14-5|14-6 to 15-5| 15-6 to 1 


7|1.35/4.48} .544.58| .57|4.39| .9014.16| .994 


2.75| .82 


sus 


| 


Item 


M|sD|M|SD|M|SD|M|SD| M 


.33/2.58/1.42/2.22/1.02/3.00/1.00 
.31/3.95/1.10/3.98/1.05/4.15| .66 


.16/2.71/1.30/2.39|1. 13 
.06,4.23| .824.46 
.3113.51|1.19/3.39]1. 15/3.35|1. 11/3. 23/1.06/3.23|1. 10/3. 


t 


1.22/3.45/1.04/3.13| .99]3.23| .93/3.07/1.02(3.10| .98/3.11/1.03/2.841.04| 3.08 
.67/1.22:2.68/1.31/2.33|1.14/2.81|1.24/2.49/1.03/2.45|1.05/2.40| .992.55/1.00| 2.49 
.1911.25/2.14/1.24|2.15|1.14]1.85| .72|2.16| .95/2.05| .92/2.02| .96)2.02] .93| 2.07 
3.06/1.18|3.42/1.20:3.74|1.07/3.96| .81/3.90/1.01/3.67/1.09/3.75/1.06/3.64/1.08| 3.69 


33 |2.96/1.28/2.75|1.25|2.52]1.19/2.69|1.14|2.43/1.05/2.43|1.09/2.41,1.08/2.40| .97| 2.48 


1.10/2.66/1.27/2.91/1.14/|2.77/1.052.25| .90/2.2211.01/2.21 
34 |2.33|1.02/2.09/1.24/1.80| .90|1.81 


.13/3.25/1.14/3.20|1.17|3.46/1.12]3.43|1 
.98/1.94]1.1811.70| ,88/1.88| .75]1.61 


.034.1 
10 |3.54| .90/3.60/1.11]3.41 


1 
1 
1 
1 
1 
1 
1 


.78|1.17|2.82| .96|2.87| .97|8.08| .68|2.89| .90]2.87| .96/3.00| .94/2.87| .87| 2.91 


.31/1.18/3.32| .99/3.37/1.09/3.38| .84/3.39| .97/3.49| .92|3.38| .983.41 


2.96 
2.50 
3.80 
3.15) 
M. 
3.15) 
4.17 
3 
1.81 


-06 
94 


2 
3 
3 
2 
2 
2 


ч сч со TN O OO 


42 |8.11|1.30(2.89/1.3412.63|1.31)2.38|1.21/2.38|1.17(2.58/1.13(2.69|1.09|2.5411.00/ 2.61 


43 |3.061.15/3.25|1.16/3.09/1.04|3.15/1.233.05/1.053.24| .95|3. 11/1 .03|3. 11 
50 |3.31/1.3033.51/1.02:3.93| .89/3.77| .93|3.84| .99|3.82| .87|8.701.02|8.87| .93| 3.75 


51 |2.80)1.24)2.97/1.12)2.58/1.20|2.92) .963.4011.27/2.96/1.29/3.351.20/3.37]1.12| 3.10 
52 3.42| .713.29| .84/2.94/1 .29/3.12/1.01/3.74| .87/3.36/1.11/3.441.03/3.63| .93| 3.41 


48 |2.94| .932.83/1.00/2.46|1.21/2.42 1.04/2.04| .8912.04| .85|2.09| .94/2.12| .86) 2.19 
49 |2.35)1.14)2.02|1.07|1.67| .811.77| .8911.42| .69/1.57| .93|1.58| .94]1.55| .78| 1.62 


46 |3.061.243.60/1.113.87/1.12/3.88| .80/3.78/1.04/3.68|1.05/3.671.073.64|1.14| 8.66 
47 |2.501.49/2.11/1.05/1.89/1.09/2.00|1.07/1.88| .89/1.90| .89|1.97| .88/1.88| .72| 1.96 


20 |2.15/1.13|2.00]1.07/1.87]1.23|2.08| .96]1.90| .93/1.99/1.00/1.93| .991.84| .91| 1.94 
21 |2.81/]1.04|2.54/1.16/2.43| .99/2.35]1.14]1.86| .98|1.74) .87]1.70| .86/1.63| .87| 1.89 
22 |2.13/1.12/2.46/1.23/2.11]1.00]2.15|1.23]1.87| .79/2.04| .96|1.96| .992.17| .96| 2.04 
23 |3.24]1.12/3.45/1.10,3.80| .77|3.42| .93|2.53| .99/2.46/1.02.2.33| .982.20 

35 |1.8011.08/2.00|1.21/11.67| .9811.62| .88|1.69| .94]1.60| .80]1.77| .9511.98| .92) 1.75 
36 |3.13/1.25/3.20/1.04/3.63| .87/3.65| .92/3.29| .90|3.20| .92/3.24) .85/3.22| .91 3.26 
37 |3.19/1.09/3.25/1.12,3.57/1.04/3.62| .88/3.73| .89/3.65| .92/3.73| .90|3.59| .92| 3.62 
38 |2.1711.052.75/1.11/2.17| .89|2.69| .91/2.34| .87/2.28} .92/2.30| .86/2.19) .81 2.32 
39 |2.94/1.06/2.7411.06/2.74/1.24 2.85/1.03|2.78/1.08/2.80/1.06/2.81)1.112.711.15| 2.79 
40 |2.63]1.22(2.751.152.481.06|2.35|1.11/2.54| .89/2.58| .98/2.66) .96/2.50| .89) 2.59 
Al |3.761.07/3.60/1.17/3.83/1.09/4.35| .48/3.88| .80/3.08| .94/3.74] .98|3.54| .98| 3.74 
44 |3.13| .88|3.58| .933.17| .82/3.58| .88|3.46| .893.35| .97/3.34| .97]3.30| .92 3.36 
45 |3.63/1.14/3.69/1.07,3.76/1.15/3.92/1.03,4.08| .80/3.95| .90/3.98| .91/3.89| .87 3.93 


19 |2.48|1.13|2.51/1.07/2.15|1.04|2.69|1.1012.24| .952.27| .97|2.24/1 .03|2 
24 |3.46|1.12|3.60|1.16/3.61) .87/3.77| .85|3.62| .85|3.69) .93/3.58| .91/3.69 
25 |3.73/1.24|3.69/1.12/3.52/1.02/3.81 


15 |3.31/1.12/3.15|1.14/3.17|1.13|2.92| .96/2.58/1.142.6711.222 
18 |2.94|1.47|2.46/1.352.13|1.10|2.15|1.13/1.77 .8911.74| .89]1.74) .901 


16 |3.5411.1213.48|1.15/4.00 


13 |2.74|1.43|3.14]1.23/2.93|1.22/2.92]1.0012.74/1.03 2.771.1 
17 3.17]1.21/3.15/1.34]3.04]1 


14 |2.35|1.16/2.75|1.27/2.48]1.33|2.58|1.042.4211.21/2.50]1.2: 


11 |3.54/1.21]3.58|1.12/3.78| .93|4.15| .80/4.03| .828 
12 |3.57|1.27|3.48|1.30/3.76|1.25|3.50|1.34/3.11/1.203 


26 
27 
28 
29 
30 
31 
32 
Table 2—Continued 

TABLE 3 
MEANS, STANDARD DEVIATIONS, AND F VALUES OF FORM II ATTITUDE TEST ITEMS FOR MALES AND 
FEMALES COMBINED AT AGE LEVELS 11-5 THROUGH 17-6 


(1961-62) 
Item | Age: ≤11-5 | 11-6 to 12-5 | 12-6 to 13-5 | 13-6 to 14-5 | 14-6 to 15-5 | 15-6 to 16-5 | 16-6 to 17-5 | ≥17-6 | Total 
     | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | F 
1 11.34) .47]1.32| .41/1.25| .43/1.32) .49]1.36| .4811.40) .49|1.51| .50/1.53| .56| 1.34 | .483 110.74" 
2 11.40) .49]1.43 491.59 4911.692! .491.60| .49/1.51| .51/1.52) .501.56| .61| 1.57 501 | 7.06" 
3 |1.14| .35/1.16) .37]1.14| .3511.15) .38/1.12| .32]1.15| .36]1.14| .34]1.10| .31| 1.14 | .349 | .80 
4 |1.39| .49/11.59| .50]1.64| .48|1.63| .49]1.58| .49]1.55| .52/1.50| .50|1.53| .56| 1.58 | .498 | 7.01" 
5 |110| .3011.06| .2311.07| .3011.06| .28/1.03| .17]1.04| .19/1.04| .19| .03| .17| 1.05 | .241 | 2.41* 
6 11.99) .4111.95| .431.29| .45]1.33| .48|1.32| .47]1.36| .4811.37| .48/1.31| .46| 1.31 | .466 | 2.61* 
7 |1.10| 301 07| .281.07| .311.06| .261.03| .17]1.04| .19]1.02| .15]1.05| .22| 1.05 | .241 | 3.11** 
s 11.20) 401.221 .42]1.98| .46/11.29| .471.22| .411.23| .43]1.20| .40| .18| .38| 1.24 | .437 | 2.92" 
9 |1.83|.3711.91| .29]1.94| ,24]1.94| .24]1.97| .17]1.96| .20/1.98| .15/2.00| .35| 1.95 | .232 | 8.55%" 
10 |1.96| .44]1.18| .40]1.21| .41]1.21| .421.22| .41]1.21| .40]1.21| .41]1.25| .43| 1.21 | .415 | .5l 
11 11.13| .34]1.17| .39]1.13| .391.10| .32]1.07| .26/1.10] .30]1.08| .26]1.05| .22| 1.10 | .319 | 3.47** 
12 |1.19| .39[1.26| .44/1.37| .51/1.37| .50/1.45) .50/1.47| .501.45| .50/1.54) .52) 1.40 | .498 10.00" 
13 11.47 .50/1.41| .491.49| .50/1.50| .51/1.54) .50/1.52| .52/1.54| .50|1.60| .55| 1.52 | .507 | 2.92** 
14 |1.44| .5011.53| .51/1.60) .501 54| .50]1.56| .50/1.55| .50|1.57| .50/1.53| .50| 1.55 | .501 | 2.17* 
15 |1.32| .481.38| .51]1.47| .51]1.57| .50]1.66| .47]1.60, .53|1.66| .47/1.05| .48| 1.57 | .504 |18.82* 
16 |1.14| .3511.13| .36/1.11] .35]1.10| .33]1.10| .30]1.13| .371.15| .35|1.13| .33| 1.11 | .333 | 1.20 
17 |1.48| .501.50| .50/1.59| .51/1.63| .49]1.63| .481.68| .511.57| .49/1.55| .52| 1.61 | .498 | 4.39" 
18 |1.60| .49|1.68| .47|1.79| .41|1.84| .37]1.89| .31|1.87| .331.94| .23/1.95| .22| 1.84 | .369 22.86" 
19 |1.61| .49]1.58| .491.72| .46|1.74| .44]1.76| .43|1.70| .50]1.76| .43/1.86| .37| 1.73 | .452 | 6.36" 
20 |1.73| .44]1.77| .42]1.90| .31]1.90| .30/1.93| .25/1.89| .31]1.92| .26]1.90| .32| 1.90 | .307 13.44". 
21 |1.45| .50]1.56| .51/1.76| .431.81| .40|1.88| .33/1.88| .32]1.93| .25]1.93| .44| 1.81 | .402 |39.22" 
22 |1.63| .48]1.66| .48|1.81| .40]1.82| .38/1.85| .36]1.82| .38|1.80| .351.75| .40| 1.81 | .394 |10.45" 
23 |1.38| .49]1.29| .451.25| .44]1.33| .48|1.54| .50]1.63| .48|1.75| .43]1.79| .48| 1.44 | .503 |58.51" 
24 [1.22| .41]1.24| .421.16| .37]1.78| .40/1.50| .361.16| .37]1.11| .32]1.10| .34| 1.16 | .374 | 2.58* 
25 |1.09| .291.12| .51/1.08| .29/1.09| .33/1.10| .3011.09| .29/1.08| .26/1.09| .29| 1.09 | .302 | .54 
26 |1.57| .501.51| .481.50| .52/1.50) .51/1.52| .50]1.54| .52/1.52| .50|1.47| .52| 1.51 | .511 | -66 
әт |1.20| .47]1.96| .4511.28| .46|1.26| .45|1.20| .40]1.19| .39]1.21| .40]1.24| .43| 1.24 | .432 | 2.58* 
28 |1.41| .49/1.51| .42]1.57| .50[1.66| .49|1.76| .43/1.74| .44|1.82] .391.80| .40| 1.67 | .474 23.99" 
29 |1.23| .421.30| .33]1.34| .48|1.34] .481.34| .471.38| .491.35| .481.48| .52| 1.34 | .479 | 2.69%" 
30 |1.52| .50]1.51, .50]1.63| .49/1.65| .49]1.63| .49/1.63| .48|1.67| .47]1.68| .49| 1.63 | .488 | 3.07“ 
31 |1.75| .43]1.75| .44]1.83| .38]1.87| .35]1.88| .331.83| .371.89| .31/1.92| .37| 1.85 | .362 | 5.50" 
32 |1.97| .45]1.10| .50]1.18| .39]1.17| .381.52| .361.14| .35|1.18| .38/1.19| .46| 1.17 | .382 | 2.12" 
33 [1.51| .51]1.57| .471.66| .48|1.72| .47[1.75| .44]1.08| .49]1.75| .43|1.74] .40| 1.70 | .469 | 8.06” 
34 [1.70 .461.78| .50/1.90| .30]1.91| .291.94| .24]1.91| .29]1.95| .22)1.96| .41| 1.91 | .302 16.92" ” 
35 |1.83| .37| .81| .41]1.94| .24]1.93, .25]1.95| .22/1.93] .35/1.94| .24]1.94| .43| 1.93 | .277 | 8.21" 
36 |1.15| .371.23| .4211.21| .43|1.18] .401.27| .45]1.34| .4911.90| .441.30| .48| 1.24 | .438 | 7.04" 
37 [1.26| .44]1.98| .46/1.16/ .37|1.15] .37|1.10] .31/1.13| .341.12| .32]1.17| .37| 1.15 | .362 | 7.75" 
38 |1.69| .46|1.77| .49/1.87| .351.91| .311.91| .29]1.89| .34/1.93) .2611.92| .37| 1.89 | .332 |13.02*" 
39 |1.51| .50]1.52| .50[1.58| .50]1.62| .50]1.62| .49/1.59| .49/1.58| .49| .66| .59| 1.60 | .498 | 2.11" 
40 |1.62| .491.66| .491.73| .47]1.72| .45[1.68| .47|1.74] .44|1.70| .46|1.67| .47| 1.70 | .465 | 1.90 
41 [1.21| .41]1.17| .37]1.10| .32]1.09| .32]1.09| .29| .11| .32]1.12| .33]1.18| .38| 1.11 | .326 | 4.40" 
42 |1.33| .47|1.47| .50]1.68  .48|1.68| .48|1.66| .48|1.59| .51[1.57, .50/1.56/ .50| 1.63 | .492 |14.03", 
43 |1.42| .511.30| .46/1.47| .51]1.37| .49/1.39| .49/1.38| .501.31| .461.28| .45| 1.39 | .494 | 4.88* 
44 1.97| .46/1.22| .411.24| .44]1.26) .461.20| .44/1.25| .44]1.30| .46]1.93| .42| 1.26 | .445 | .74 
45 |1.19| .39]1.18| .381.10| .32/1.10| .32]1.08| .27/1.12| .32|1.10| .30]1.09| .33| 1.10 | .315 | 3.95" 
46 |1.32| .50/1.30| .461.25| .45|1.20| .41]1.17| .38]1.19| .39|1.15| .36| .15| .43| 1.21 | .417 | 5.41" 
47 1.74 .47]1.78| .4111.89| .32]1.93| .29]1.94| .23]1.91| .29]1.95| .22]1.94| .28| 1.91 | .302 |14.34"" 
48 |1.43| .51]1.47| .501.07| .47|1.70| .46/1.77| .43|1.70, .41|1.84| 371.89 .40| 1.71 | .465 22.49" 
49 |1.79| .43]1.82| .39]1.91| .301.95| .23/1.96| .21]1.92| .281.95| .21]1.97, .39| 1.93 | .274 12.007. 
50 11.23| .43/1.18| .401.17| .401.17, .42/1.13] .34]1.09| .30/11.13| .34]1.10 .31| 1.15 | .375 | 3-40" 
51 |1.43| .511.50| .501.48| .501.52, .53|1.48) 501.42) .49/1.39| .49]11.29| .45| 1.47 | .508 | 3.17" 
52 |1.98| .461.28| .45|1.30| .40]1.34| .481.32| .48/1.30| .46/1.25| .43/1.33| .47| 1.31 | .470 | 1.21 


MEASUREMENT OF VOCATIONAL MATURITY
Table 3—Continued
Age
Item | ≤11-5 | 11-6 to 12-5 | 12-6 to 13-5 | 13-6 to 14-5 | 14-6 to 15-5 | 15-6 to 16-5 | 16-6 to 17-5 | ≥17-6 | Total
 | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | F
53 |1.55| .511.60| .49]1.61| .4911.56) .511.56| .531.55 501.51 .501.50| .50| 1.57 | .502 | 1.17 
54 |1.71| .49/1.81| .39]1.89| .31/1.89| .32/1.88 -32/1 .83| .38/1.88| .32/1.93| .25| 1.87 | .341 | 6.84** 
55 |1.64| .49]1.70| 461 79| .41/]1.83| .39/1.83| 871 78| 411.86 ‚351.92! .28| 1.80 .401 | 7.20** 
56 |1.31| .48/1.39| .5011.47| .50/1.52| .52/1.55| .50/1.59| .50/1.69 4611.60) .49| 1.52 | .510 | 9.58** 
57 1.54| .51]1.54| .50)1.64) .48]1.67| .47]1.67 .4711.57| .4911.68) .47/1.65| .48| 1.65 | .482 | 3.97** 
58 (1.50) .51/1.56| .50/1.70 .4811.77| .44/1.76| .43/1.71| .45/1.80| .40/1.78 .42| 1.72 | .459 |11.60** 
59 |1.22 .43/1.15| .35/1.16) .42/1.16 .41/1.09| .321.15| .39/1.08| .28/11.06, .23 1.14 | .374 | 4.26°* 
60 |1.24| .44/1.21| .42|1.28| .48/1.29 471.27 „4511.33 .521.29 .45/1.18] .38| 1.27 | .464 | 1.7 
61 |1.13| .36/1.17| .39/1.20| .45]1.21) .43/1.21| .43/1.26| .50/1.26| .44/1.31) .46 1.21 .437 | 2.17* 
62 |1.53| .53]1.60| .49/1.65| .50]1.58| .52]1.51| .501.61 .51/1.46| .50/1.50| .50| 1.57 | .509 | 5.72** 
63 11.50. .51/1.49| .50/1.68| .48/1.58| .53)1.61 .4911.62| .49/1.61| .49/1.63| .48| 1.60 | .500 | 4.24** 
64 |1.60| .50/1.54| .50/1.67| .49/1.71 .47]1.76| .49]1.78| .43|1.84| .36/1.74| .44| 1.71 | .460 | 9.51** 
65 |1.50| .51]1.53| .51]1.63| .49/1.66] .49/1.69. .4611.66| .48|1.67| .47/1.78| .42 1.65 | .485 | 5.21** 
66 |1.61| .51/1.69| .48/1.79| .41]1.81| .411.79 .41]1.77| .42]1.86| .35]1.94| .23| 1.7 .418 | 7.39** 
67 |1.47| .5311.52| .50/1.64| .51|1.66| .4911.65 „481.66* .50/1.69) .461.61 .49 1.63 | .495 | 4.33** 
68 1.33! .5011.30| .46]1.36| .50/1.35| .50/1.38| .49]1.48| .51]1.37| .481.32| .47| 1.37 | .495 | 2.80** 
69 |1.36| .51/1.40| .49]1.45| .53/1.50| .52/1.43| .501.42 ‚501.49 .5011.54| .50| 1.45 | .512 | 2.54* 
70 |1.18| .42]1.20| .40/1.19| .42]1.15| .40]1.11| .31/1.12 .98|11.14| .3411.22] .42| 1.15 | .381 | 3.39** 
71 |1.58| .52]1.61| .50]1.67| .48/1.65| .49]1.61| .491.61 .5011.63| .48|1.68| .47| 1.63 | .490 | 1.30 
72 |1.71| .49/]1.76| .43]1.80| .41]1.86| .36/1.83) .37]1.79 .42]1.88| .33|1.88| .33| 1.82 | .391 | 4.13** 
73 |1.47| .53]1.61| .501.67| .47/1.65| .48/1.68 .A711.67| .48]1.56| .50/1.61| .49) 1.65 | .483 4.36** 
74 |1.09| .34|1.13| .34|1.10| .35/1.09| .32]1.07 .9711.13| .36/1.10| .30/1.04| .20| 1.09 | .316 1.47 
75 |1.36| .51]1.34| .47]1.42| .50/1.40| .491.40 .4911.30| .471.42) .49,11.43| .50| 1.39 | .493 1.83 
76 |1.36| .51]1.41| .49]1.50| .51/1.51| .521.49 .50/1.41| .50/1.50| .50/1.51) .50| 1.48 .507 | 2.61* 
77 1.53 .531.54| .50/1.62| .50/1.65| .49/1.69 .46/11.69| .47|1.75| .43|1.61| .49| 1.65 | .486 5.18**
78 11.26! .47/1.19| .40/1.24) .46|1.24| .461.80 .4611.31| .481.37| .48/1.26| .44| 1.27 | .460 3.66** 
79 [1.53| .54/1.57| .51/1.08| .48[1.70| .47|1.65 .4811.62| .49|1.66| .47/1.69| .46 1.06 | .484 3.40** 
80 |1.40| .53/1.49| .501.46| .52/1.42| .50|1.47 „5011.43! .50/1.31| .46/1.32| .46| 1.44 .505 | 2.93** 
81 [1.51| .54/1.60| .50/1.71| .47/1.68| .481.69 .4611.73| .45/1.62| .49/1.50| .50| 1.08 | .479 5.13% 
82 1.37| .531.46| .51|1.51| .51/1.57| .511.60 .4911.50| .51/11.59| .491.68| .47| 1.55 .506 | 6.00** 
83 |1.38| .53/1.53| .50/1.53| .51|1.57| .511.55 .50/11.60| .50/1.59| .49/1.53| .50| 1.55 .508 | 2.98** 
84 |1.36| .52/1.49| .50]1.50| .50/1.51) .521.56 .5011.52| .51/1.64) .481.57| .50| 1.53 .508 | 4.24** 
85 |1.43) .55/1.51| .511.56| .51/1.59| .50/1.63 .4911.56| .51/1.58| .491.67| .47| 1.58 .503 | 3.75** 
86 |1.42| .55|1.60| .50|1.69| .47/1.67| .471.60 .5011.55| .51/1.67| .47/1.69| .46 1.63 .493 | 7.07** 
87 [1.40] 561 46| .53]1.57, 5211.52] .501.53| .50]1.61| .521.50) .501.47 .50| 1.53 | .514 | 3.41** 
88 |1.53| .55|1.60| .50/1.72| .45/1.79| .411.78 .4211.75| .44/1.84| .36/1.83) .87 1.75 | .442 |11.28** 
во 11791 551.69| 491.89| .34]1.87| .30 1.91| .28/1.86| .361.88| -32/1.92/ .28 1.86 | .361 19.3075 
90 120| 47123| 4311.29) 48/1.39| .48|.41| .50/1.42| 511.49 .49]1.46| .50 1.35 | .492 | 8.50*" 
91 148| 55151| .52]1.49| .521.56| .51|L.53| .50].49| .52/1.54| 501.53 .50| 1.52 | .513 | 1.74 
оз [74 `БОп тт 4511.88) .33]1.88| .34].88| .33/1.83| -39|.84| 371.92 .28 1.86 | .358 | 5.047. 
оз 153 `571 gel 53/170, 471.69] -48|L.71 .45]1.09 .481.77| .421.85| .36 1.09 | .475 | 5.417. 
94 Loe] san зт 510.44 | 52 1.47| .51|L.50, .50].45| .541.56 .501.53 .50) 1.46 .516 | 5.08%" 
95 153 S81 65| 511.70 -4311.791 .42]1.77, .42|1.75| .45|L.77| .421.78| .42 1.75 АТ | 7.0088 
96 1.85 `561 89| 811.53) .52]1.58| .511-66| .48/1.00 511.79] .411.81| 40 1.50 | .509 17.0475 
от [1 66 7591.78 471.73. 4911.80 .40|1.85| .36(1.81) „411.87 .341.86 -35| 1.80 | .420 | 7.57% 
ов 1:22 ‘sult эз| 491.41 .50]L.49, 5201.30] 501.4 511.44) 501.46 .50| 1.40 | .507 | 3.807 
99 1421 Bal. бв soi ТӨ 441.78 .41/1.83 281.80] .45|1.87) „341.04 .23) 1.78 | .430 16.947 
100 160 `561 | 451.90] .30]1.99| 271.92 271.80 .37[1.83| .38]1.86| .35) 1.89 | .333 |13.24%* 
*p < .05.
**p < .01.


for the statement “Why worry about choos- 
ing an occupation when you don’t have 
anything to say about it anyway,” which 
elicited strong disagreement, to 4.62 for
the item “It’s unwise to choose an occupa-
tion until you have given it a lot of
thought,” which aroused strong agreement. 
Of the 100 analyses of variance which were 
conducted, 16 yielded F values significant
at the .05 level, and 41 produced F values 
significant at the .01 level. Not all of these 
differences between age levels formed curves 
which were monotonic, however, as plots 
of the item means made apparent. There 
were notable reversals in the trends over 
age of some items, as is indicated in Table 
2 by both increases and decreases in the 
means of items like the third one, which 
are higher in the middle of the age range 
than at the extremes. 

Comparable data on item means, stand- 
ard deviations, and analyses of variance for 
Form II are presented in Table 3. The 
median item mean was 1.57, and the median 
standard deviation was .465. For the total 
sample, the lowest item mean was 1.05, as 
in Form I, for the statement “Why worry 
about choosing an occupation when you 
don’t have anything to say about it any- 
way,” indicating that 95% of the subjects 
answered it false, and the highest item mean 
was 1.95 for the concept “Plans which are 
indefinite now will become much clearer in 
the future,” indicating that 95% endorsed this
item as true. The analyses of variance for 
the 100 items of Form II yielded 10 F val- 
ues significant at the .05 level and 76 
significant at the .01 level, for a total of 
86 item means which differed reliably be- 
tween age levels. Of these, however, only 
36 were monotonic functions of age, as es- 
tablished by t tests for differences between 
adjacent item means. 
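
The monotonicity criterion can be illustrated with a simplified sketch. The check below looks only at the direction of the differences between adjacent item means; it does not carry out the t tests used in the study, since the group sizes needed for them are not given in this section. The function name and both lists of means are hypothetical and are not taken from Table 3.

```python
def is_monotonic(means):
    """Return True if the means never reverse direction across levels."""
    # Differences between each pair of adjacent levels.
    diffs = [later - earlier for earlier, later in zip(means, means[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


# Hypothetical item means over eight age levels (illustration only).
rising = [1.20, 1.25, 1.31, 1.40, 1.44, 1.52, 1.60, 1.66]
reversing = [1.30, 1.42, 1.55, 1.61, 1.58, 1.47, 1.39, 1.33]

print(is_monotonic(rising))      # steadily increasing means
print(is_monotonic(reversing))   # higher in the middle than at the extremes
```

An item such as the second one can yield a significant F and still fail the criterion, which is the case described for items whose means peak in the middle of the age range.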

To determine whether Form I or Form 
II was more effective in differentiating be- 
tween age levels they were compared on 
the total number of F ratios which they 
yielded at the .05 and .01 levels. The data 
were cast into a 2 × 2 contingency table
for two independent samples, where one 
criterion of classification was “Form I or
Form II” and the other was “significant or
not significant.” With one degree of free-
dom, the resulting chi-square value of 19.24,
corrected for continuity (Siegel, 1956), was 
significant at the .001 level. An inspection
of the discrepancies between the
observed and expected frequencies in the 
cells of the contingency table showed that 
Form II produced a greater number of sig- 
nificant Fs than Form I. 
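
The reported value can be checked from the counts given in the text: 16 + 41 = 57 significant Fs out of 100 for Form I, and 10 + 76 = 86 out of 100 for Form II. The sketch below applies the usual shortcut formula for a 2 × 2 table with the continuity correction. The function name is an assumption made for illustration, and clamping the corrected difference at zero is a common convention rather than something stated in the text.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Reduce |ad - bc| by n/2 (continuity correction), never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom


# Rows: Form I, Form II. Columns: significant, not significant.
form_i_sig = 16 + 41    # Fs significant at the .05 or .01 level, Form I
form_ii_sig = 10 + 76   # Fs significant at the .05 or .01 level, Form II
chi2 = yates_chi_square(form_i_sig, 100 - form_i_sig,
                        form_ii_sig, 100 - form_ii_sig)
print(round(chi2, 2))
```

Under these assumptions the computed value rounds to 19.24, in agreement with the figure reported above.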


JOHN O. CRITES


Item Type. The same kind of chi-square 
test was also used to compare the differen- 
tiating power of the first and third person 
item types, but only for Form II, since it 
had been shown to be superior to Form I in 
the previous analysis. The contingency table 
which was set up had “first or third person” 
as one factor and “significant or not signifi- 
cant” as the other, the cell frequencies be- 
ing the total number of items for each item 
type which was monotonically related to 
age, as demonstrated by the F and t tests. 
Since the cell entries for the first and third 
person items were exactly the same, 18 sig- 
nificant and 32 not significant, the value 
of the chi square was, of course, zero and 
nonsignificant. 

Conclusions. There are three major con-
clusions which can be drawn from the re-
sults of the age analyses. First, irrespective
of variations in response format and item
type, it is evident that responses to certain
verbal statements of vocational attitudes
and concepts, which are theoretically rele-
vant to the choice of an occupation, are
monotonically related to age during the
adolescent years. Second, the true-false item
format for such verbal statements appears
to be preferable to item scaling in differenti-
ating between subjects grouped according to
age intervals which range from ≤ 11-5 to
≥ 17-6. And, third, there are no statistically
demonstrable differences in the numbers of
consistently age-related items which are
written in the first and third person singu-
lar.


Grade Analyses 


Response Format. Table 4 lists the Form
I item means and standard deviations for
Grades 5 through 12, as well as the F ratios
from the analyses of variance, with the .05
and .01 probability levels noted for those
which were significant. (The descriptive
statistics—means and standard deviations
—for the total sample are the same as those
given in Table 2 for the age analyses.) In
all, there were 10 Fs which were significant
at the .05 level and 51 at the .01 level. As
in the age analyses, some of the trends in
item means from one grade to another were
not monotonic functions of grade and con-
sequently these items failed to meet this


TABLE 4
MEANS, STANDARD DEVIATIONS, AND F VALUES OF FORM I ATTITUDE TEST ITEMS FOR MALES AND
FEMALES COMBINED IN GRADES 5 THROUGH 12
(1961-62)

[Entries for Items 1 through 52 are not legible in the source.]




Grade
Item | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Total
 | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | F

53 |3.1: 1.03/2.42]1.22:2.74| .97)2 11.09]2.80|1.21/2.73/1.11] 2.80 | 1.13 | 2.80** 
54 2. 1.23/1.60/1.06/1.90| „922 31.24/1.96/1.112.02/1.08| 2.09 | 1.16 | 3.38** 
55 |2.78 39/1.20/1.96/1.06/2.33| .77|2 4/1.04/2.14/1.13|1.96| .90| 2.27 | 1.10 | 5.90** 
56 3.2811. 1.19/2.82/1.37|3. 111.102 3/1.07|2.57/1.26/2.41]1.12| 2.73 | 1.24 | 5.48** 
57 2.93]1.37/3.03/1.12/2.32]1.202.33| .8212. 38/1.14/2.39/1.20/2.50]1.04| 2.61 | 1.19 | 3.00** 
58 |2.8711.21/3.03|1.19/2.12|1.19/2.44|1.29/2.45|1.28|2.31]1.12]1.98|1.13|2.101.09| 2.34 | 1.21 | 7.35** 
59 |3.37/1.24]3.81/1.10/3.62]1.25/3.93|1.12]4.55| .673.89| .9414.03| .97/3.93| .94| 3.86 | 1.08 | 4.66** 
60 3.3911.043.34/1.08/3.46/1.143.56/1.0713.83/1.213.631.113.51/1.073.641.03| 3.54 | 1.09 | 1.07 
61 |3.61/1.25/3.83/1.143.068|1.22/3.74|1.0044.10| .76,3.70]1.14/3.48/1.23/3.43/1.20| 3.63 | 1.19 | 1.84 
62 |9 751.102.81| .932.64/1.29/2.90| .79/3.24]1.01/3.01]1.02/3.00/1.22/3.13| .99| 2.96 | 1.09 | 1.96 
63 |2.76/1.2912.90]1.13/2.00]1.23|2.44| .92/2.93]1.26/2.60| .98|2.87|1.08)2.711.13| 2.73 | 1.13 | 1.05 
64 |2.93]1.15/2.61/1.32/2.88/1.53/2. 15|1.04|2.34|1.27|2.281.35|2.421.312.13|1.17| 2.43 | 1.30 | 3.79** 
65 |9.88|1.139.88| .982.26/1.29/3.07|1.12/2.45/1.30(2.57 1.04|2.35/1.00/2.57/1.00| 2.60 | 1.11 | 3.41** А 
66 |3.01]1.24/2.78/1.002.42]1.23/2.56| .57)2.72|1.11/2.28| .892.25| .99|2.22| .92| 2.45 | 1.05 | 5.01** 
67 |3.04/1.20]3.14| .98/2.64]1.00/2.89| .83/2.93/1.05/2.67/1.052.08|1.14/2.631.08| 2.78 | 1.09 | 2.25* 
68 |3.40/1.20]3.22| .90/3.56]1.00/3.19| .72/3.14| .943.24| .86|3.26] .87/3.33| .86] 3.31 | .938| .99 
69 |3.30]1.28/3.31|1.15/2.94]1.17/3.22| .96/3.55/1.19/2.87| .87]3.01/1.11/3.06/1.03| 3.10 | 1.10 | 2.17* 
70 |3.82| .91/3.88| .99/3.40]1.343.70| .763.90| .84]3.77| .94]3.68| .99/3.76| .80| 3.74 | .983 1.29 
71 |2.85]1.20]3.29/1.04/2.78]1.24/3.00| .72/3.14/11.04]2.90| .792.93| .88/2.83| .83| 2.93 | .970 1.84 
72 |2.78|1.17|2.86) .98/2.42/1.152.59| .78|2.41| .6212.48| .74|2.51| .82(2.42| .77| 2.55 | .905 2.38* 
73 |2.94]1.332.66/1.34/2.30]1.10/2.52/1.03/2.38/1.32/2.47,1.152.581.29/2.01]1.21| 2.58 | 1.25 | 1.43 
74 |3.72| .91]3.90| .843.76 1.123.89 .63/3.86| .97 4.16 .71]3.90| .80/3.94| .86| 3.93 | .863 1.82 

75 (3.10/1.093.15| .922.82/1.073.04 .693.17 1.083.14 .89/3.35| .923.281.00 3.18 | .981 1.76 | 
76 |3.29| .96/2.86/1.03|2.74]1.20|2.67| .86/2.62| .93]3.11/1.12/3.07/1.09|3.00]1.08| 2.99 | 1.07 | 2.06* 
77 |3.1811.04/2.92| .94|2.98]1.21/2.59|1. 10/2. 41/1. 132.52|1.03]2.73]1.15|2.49]1.03| 2.72 | 1.10 | 4.05**
78 |3.63/1.06|3.54/1.33/3.94|1. 12/3.33/1.31/3.24/1.38]3. 39|1.23/3.69]1.21|3.44/1.21] 3.54 | 1.23 | 1.68 
79 |2.84|1.06|2.59|1.04|2.00|1.02/2.96|1.23/2.79|1.35|2.74|1. 22/2. 47|1.23/2.56]1.21) 2.59 | 1.20 | 3.05** 
80 |3.22| .97/3.31]1.05/2.94|1.41/3.07|1.12|3.24|1. 36/3. 11|1. 13/2. 29|1.06)3.35/1.09] 3.22 | 1.13 | .95 | 
81 2.9711.05/2.97/1.15]2.58/1.132.63| .99|2.97|1.27/2.60|1. 15|2. 76|1.12/3.02|1.05) 2.83 | 1.12 | 1.92 1 
82 3.19/11.162.93| .94|2.48|1.14|2.89|1.03]2.69| .99/2.90/1.07|2.60/1.152.52) .98| 2.75 | 1.09 | 3.80** | 
83 |3.09/1.36/2.97/1.21/2.58/1.23,3.00/1.19/2.90]1.27/3.00/1.17,2.81]1.14]2.75|1.13| 2.87 | 1.21 | 1.18 
84 |3.07/1.233.17]1.28 2.42/1.30]3.04/1.32/2.601.27/2.93/1.32 2.67/1.272.7011.22| 2.82 | 1.29 | 2.36* 
85 92.87/1.30,3.051.112.401.28/3.00/1.22/2.83/1.292.91]1.31/2.45/1.24/2.47/1.19| 2.69 | 1.27 | 2.97** - 
86 2.91/1.10/2.69/1.092.34/1.19/2.78|1.13/3.28]1.342.75/1.162.67/1.252.37]1.00| 2.65 | 1.17 | 3.45 4 
87 |2.97\1.09]2.80| .92/3.30/1.42/2.63/1.02/2.76/1.16,3.01]1.30/3.14/1.323.10].21| 3.02 | 1.22 | 1.48 | 
88 |2.87/1.00/2.80]1.15/2.10/1.25|2.50| .992.21/1.21/2.36/1.05]2.251.072.11]1.02| 2.38 | 1.12 | 4.80" | 
89 |2.58[1.39(2.461.28[1.681.14|2.33| .86/1.79| .61]1.80]1.01]1.75| .76/1.94| .81 2.02 | 1.05 | 7.32? | 
90 |3.63/11.10/3.36| .90/3.40]1.41]3.41| .95/3.48| .93/3.30]1.063.33/1.04]3.06| .94| 3.32 | 1.06 | 2.13* 
91 3.13]1.17/2.97/1.09/2.441.39/3.00/1.19/2.62/1.19/2.87]1.08|2.62]1.052.67]1.07, 2.78 | 1.15 | 2.58" 
92 (2.311.162.341.10]1.64| .84/2.19| .981.90| .99|2.17|1.01]1.99| .97/1.98] .89| 2.07 | 1.01 | 3.07" 
93 |2.99|1.24]2.88|1.18|1.96|1.26|2.78/1 . 20/2. 59/1. 22/2. 46|1 09/2. 141.012. 11| .96| 2.41 | 1.16 | 7.87" 
94 |3.98/1.083.39| 992.50 1.46/2.96/1.20,3.38/1.27/3.09/1.00/3.00|1.15(3.02 1.11, 3.07 | 1.16 | 3.19 
95 |2.97]1.232.51| .08/1.86/1.39/2.30/1.15/2.38|1.03)2.47| .94|2.34| .992.30| .92| 2.41 | 1.09 | 5.00" 
96 |3.31]1.112.85|1.10/2.58/1.54/2.44|1.29 2.55/1.00/2.44| .94|2.27| .98/2.37| .88| 2.57 | 1.11 7,1058 
97 2.51 .97}2.37) .972.30]1.332.001.022.21| .922.21 .91/2.02| .952.21 .93 2.23 | 1.00 | 1.80 | 
98 |3.19/1.253.05/1.202.78|1.47 2.63 /1.47|3.071.11/3.02/1.21]3.18|1.16/3.04|1.02. 3.04 | 1.21 | 1.18 
99 2.73/11.242.75/1.19]1.96/1.31/2.37/1.42/2.10]1.06(2.24/1.05]1.90| .811.97| .84 2.21 | 1.10 Tort] 
100 (2.211.292.0711.221.60/1.22/1.63/1.02/1.55| .72/1.97|1.11|1.68| .9111.77| .93| 1.84 | 1.08 | 2.91" - 
*p < .05.
**p < .01.

criterion for the measurement of a develop-
mental variable.

The Form II item means and standard
deviations by grade are presented in Table
5, along with the corresponding F values
from the analyses of variance, of which 4
were significant at the .05 level and 82 at
the .01 level. Further comparisons of adja-
TABLE 5
MEANS, STANDARD DEVIATIONS, AND F VALUES OF ATTITUDE TEST (FORM II) ITEMS FOR MALES AND
FEMALES IN GRADES 5 THROUGH 12
(1961-62)
Grade
Item | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Total
 | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | F
IS «4911.311 .49]1.32| .48/1.52| .50]1.45| .50/1.54| .53| 1.34 | .483 |9.70** 
2 1. "401.641 .49/1.59| 4911.55, .50]1.48| 511.59 .55 1.57 | .501 8.88** 
3 1.1 73511.15| 3711.12] .32]1.12| .33]1.15| .30/1.13| .34| 1.14 | .349 | .7 
4 1.4 48/1.62| .49]1.50| .501.03| .48|1.48| .53]1.53| .51| 1.58 | .498 | 7.78** 
5 1. 730/1.06| .28/1.03| .16]].04| .19]1.03| .17]1.02| .15| 1.05 | .241 | 3.45** 
6 |l. 74511.33| .48/1.34| 471.31] .461.37| .48]1.32| .47| 1.31 | .466 | 1.90 
е. 8111.05) .24/1.02| .15|1.04) .191.03| .16/1.04| .19| 1.05 | .241 | 4.13** 
cie 471 28| 4711.21) .41/1.21| .40]1.21| .41)1.21) .41| 1.24 | .437 | 3.14** 
9 1.8 “9611.96 211.98) .1511.95/ .22]1.98| «1412.00| .25| 1.95 | .232 |12.55** 
10 .42]1.21| .421.23 121.08 39/1.19| .40]1.22| .41| 1.21 | .415 | .98 
11 “9811.101 .311.08| .27|1.06) .241.08| .27|1.06) .25| 1.10 | .319 | 6.55** 
12 `501.38| .491.43| .49]1.53| .501.50| .50/1.52| .51| 1.40 | .498 |13.33** 
13 5011.57| .511.53| .50/1.50| .50]1.54| .53/1.05| .51| 1.52 | -507 | 5.55** 
14 | "B111.54| .50]1.56| .50/1.56| .501.52| .501.58| .49| 1.55 | .501 | 1.84 
15 1511.55| .501.67| .47]1.64| .48|1.05| .54|1.69| .46) 1.57 | .504 |20.00** 
16 "a6/1.08| .30/1.08| .27]1.15| .36]1.15| .401.12) .33| 1.11 | .333 | 5.52** 
17 151112651 .481.62| .48|1.08| .47|1.61) .55|1.56| .51| 1.61 | .498 | 5.35** 
18 "M1.83| .38/1.80| .31]1.94| .25/1.93| .25|1.96) .19| 1.84 | .369 |31.73** 
19 | 47175| 4411.75) .43]1.79| .41|1.75| .50|1.85) .37| 1.73 | .452 | 1.81** 
20 331192] .27/1.93| .261.94| .24/1.90| .3011.98] .23 1.90 | .307 118.00** 
21 45183| .39]1.87| .331.93| .26|1.93| .261.95| .33| 1.81 | .402 |49.50** 
22 4011 83| .3711.841 .37]1.89| .311.82| .38]1.84| .38| 1.81 | .394 |16.70** 
23 45135| 4811.53] .501.63| .48/1.70| .46]1.81| .43| 1.44 | .503 |59.88** 
24 ‘391 17| .39]1.14| .35]1.14| .35/1.14| .35]1.08| .28| 1.16 | .374 4.59** 
25 ‘2911.091 .3211.09| .29]1.11| .31/1.10) .30]1.05| .23| 1.09 | .302 | 1.49 
26 "591.51| .51[1.51| .5011.53| .50/1.50| .531.46| .51| 1.51 | .511 | 1.23 
27 "461 27| `461.17| .38]1.23, 4211.18, .39]1.21| .41 1.24 | .432 | 5.0077 
28 "8011.66| .49]1.75| .44]1.81| .40/1.83| .381.82| .38| 1.67 | .474 29.56** 
29 "471.33| 481.36 .48|1.36| .48|1.34| .47/1.44) .51| 1.34 | 479 2.44* 
30 1491.65| 4911.63) .49|1.66| .47|1.07| .47]1.73| .40| 1.63 | .488 5.51** 
31 "49187| .35|1.88| .331.87| .33]1.80| .351.04 -30| 1.85 | .362 6.98** 
32 891.17 .39]1.12| .331.19| .39]1.17| .37]1.19| .43| 1.17 | .382 3.59** 
33 "ABL 73| .40|1.76| „431.71 .45/1.78| .4511.73| .46 1.70 | .469 11.31** 
34 “galt 92| .271.98| 251.95 .21].95|.21]1.97| .31 1.91 | -302 24.32** 
35 26195| 2311.96) .20/1.94| .311.98) .301.87| .30 1.93 | .277 14.86** 
36 "43118| .40/1.25| .43|1.36] .481.85| .511.29) .45) 1.21 438 | 9.20** 
37 ‘381 14| .371.10| .31]l.11| .32].11| .31]1.13| .34 1.15 | .362 | 9.39** 
38 734/1.91| .31]1.90| .30|1.94) .241.93| .311.95| .29 1.89 | .332 |18.62** 
39 ‘S01 62) 491.64] .48]1.09| .491.57| .49 1.63) .55| 1.60 498 | 5.22** 
40 "47172| .45|1.68| 471.72, .4011.73| 4411.70) .46) 1.70 465 | 2.15* 
41 ‘3411.08 .30/1.09| .281.11| „321.11 .311.13| -34 1.11 | .326 | 5.62** 
42 “igi 67| .491.67| .47|1.67| .481.50| .531.55| .50| 1.63 492 |14.57** 
43 5011.38] .49|1.40| .49]1.39| .491.32 .501.31 -46| 1.39 .494 | 2.98** 
4 441 25| .46/1.20| .44|1.24| .43/1.29| .45]1.28) .45 1.96 | .445 | .95 
45 "asl1.09| .30|1.07| .20|1.08| .281.10| .301.11 -33 1.10 | .315 | 5.74** 
46 451 90| .41|1.16| .37]1.18| .391.22 .41/1.13) :38 1.21 417 | 5.76** 
4T "3311.93, .291.96| .20]1-95) „231.94  .241.95 .24 1.91 | .302 |23.10** 
48 "48172| .4011.75| .44[1.83| .381.82| 401.80) .36 1.71 | .465 |30.28** 
49 7301.94| .24|1.97| .16].94| .241.93| .261.99 27| 1.93 | .274 20.43** 
50 "ЧО 17| 4111.10) .301.15| .301.14| 361.09 .29 1.15 | .375 | 4.61** 
51 511 50l .52|1.48| .501.42| .491.37| 501.35 .48 1 47 | .508 | 3.80** 
52 46133| .491.33| -47|1.27| .451.25| .45]1.30| .46 1.31 | .470 | 1.02 
Table 5—Continued
Grade
Item | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Total
 | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | M | SD | F

53 |1.55| .511.61| .491.61| .49/1.53| .52/1.56| .501.54| .50/1.51 51/1.54| .50| 1.57 | .502 | 1.81 
54 |1.70| .48]1.81| .41/1.88| .32/1.90| .31|1.88| .32]1.88| .33|1.83| .401.92 .27| 1.87 | .341 | 9.08** 
55 [1.64] .491.70| .46/1.79| .41/1.83| .39/1.84| .36/1.80| .40]1.81| .41 1.91| .29| 1.80 | .401 | 9.17** 
56 |1.33| .48/1.40| .50/1.47| .50/1.51| .53/1.52| .50/1.67| .47]1.64 49|1.68| .47| 1.52 | .510 |11.69** 
57 11.52) .5111.53| .50]1.66| .48/1.68| .47|1.68) .47/1.61) .49]1.55| .51/1.69) .46 1.65 | .482 | 5.48** 
58 |1.49| .51/1.50| .50/1.71| .48/1.78| .44|1.76| .4311.76| .43|1.73 .47]1.81| .40| 1.72 | .459 |15.32** 
59 !1.21| .42/1.13| .34/]1.18| .43]1.14| .40|1.10| .32]1.07| .26/1.09| .31]1.09| .28| 1.14 | .374 | 4.76** 

60 |1.25| .46/1.25| .43/1.28| .48/1.30| .47|1.26| .46/1.30| .46/1.32| .48|1.23) .42 1.27 | .464 1.02 
61 |1.16| .39]1.15| .36/1.21| .45/1.22| .45|1.18| .41/1.27| .44|1.27| .46/1.30| .46| 1.21 | .437 3.16** 
62 |1.52| .52]1.00| .49/1.07| .49]1.57| .51|1.52| .51/1.58) .49]1.53| .51/1.48| .50| 1.57 | .509 | 5.90** 
63 |1.50| .5111.52| .50/1.65| .50/1.60| .51/1.60| .49/1.08| .47|1.56) .51]1.64| .48| 1.60 | .500 | 3.65** 
64 |1.53| .51/1.57| .49[1.66| .49/1.73| .45/1.78| .421.77| .42]1.80| .42/1.79| .41| 1.71 | .460 | 1.57** 
65 |1.48| .51/1.52) .51/1.64) .50]1.67| .47/1.70| .46/1.66| .48/1.65| .49/1.72) .45| 1.65 | .485 6.82** 
66 |1.61| .511.69| .48]1.79| .42/1.82] .404.77| .42/1.88| .32]1.79| .43]1.92| .27| 1.79 | .418 |10.81** 
67 |1.43| .52]1.58| .49/1.64| .511.67| .48|1.64| .491.67| .47|1.66] .49[1.64| .48| 1.63 | .495 | 5.81** 

68 |1.29| .48/1.34| .4711.36| .50/1.37| .50|1.38| .49/1.43| .49|1.41| .51/1.32| .47| 1.37 | .495 | 1.62 
69 11.39) .511.37| .481.46| .54/1.47| .51]1.41| .49|1.50| .50]1.46| .53/1.56) .50| 1.45 | .512 | 3.09** 
70 11.20) .43/1.18| .381.17| .42/1.17| .40|1.11| .321.14| .35|1.09| .33|1.21| .41| 1.15 | .381 | 3.42** 
71 |1.58| .52/1.63| .49]1.65| .49/1.66| .48]1.59| .49|1.70| .46]1.61| .52/1.68| .47| 1.63 | .490 | 2.50* 
72 |1.69| .49]1.75| .43]1.81| .41/1.84| .37/1.82| .39/1.89| .31/1.79] .44/1.91| .29| 1.82 | .391 | 6.08**
73 |1.50| .52/1.60| .50/1.67| .47[1.66| .48)1.68| .47/1.64| .48/1.60] .52|1.60) .49| 1.65 | .483 | 3.75** 

74 |1.13| .36|1.12| .33]1.11| .36/1.08| .31|1.08| .27]1.07| .251.12| .37|1.07| .25| 1.09 | .316 | 1.64 

75 |1.36| .50/1.37| .48|1.41| .50]1.38| .49/1.40| .49]1.35| .48|1.31| .49|1.46| .50| 1.39 | .493 | 1.39 
76 |1.36| .50|1.41| .49|1.50| .511.51| .511.47) .50/1.45| .50|1.46| .53|1.52| .50 1.48 | .507 | 2.69** 
77 11.59) .52|1.58| .49/1.60| .51]1.65| .47/1.08| .47|1.77| .42/1.65| .511.74| .44| 1.65 | .486 | 6.50**
78 |1.24| .45|1.17| .41|1.25| .46]1.24| .45|1.29| .46/1.39| .49|1.34| .5011.32| .46| 1.27 | .460 | 4.83** 
79 |1.54| .53|1.55| .51|1.68| .48]1.68| .47/1.67| .47/1.60| .49|1.59| .52/1.68| .40| 1.66 | .484 | 4.31** 
80 |1.41| .52]1.48| .50|1.45| .52]1.44| .50]1.45| .50/1.43] .50/1.41) .52/1.30| .46| 1.44 | .505 | 2.02* 
81 |1.55| .53/1.57| .51/1.71| .47/1.70| .47]1.68| .47]1.80| .40]1.69| .49/1.55| .50| 1.68 | .479 | 7.00** 
82 |1.40| .52/1.45| .51]1.52| .511.56| .50]1.59| .49]1.53| .50]1.48| .53]1.69| .46| 1.55 | .506 | 6.36** 
83 |1.38| .52/1.54| .50]1.54| .50[1.56| .52|1.56] .50|1.57| .50/1.56| .52|1.57| .50| 1.55 | .508 | 3.12** 
84 |1.38| .52/1.45| .50]1.51| .5111.52| .51|1.57| .49|1.54| .50|1.56| .52/1.61| .49| 1.53 | .508 | 4.51** 
85 |1.49| .54]1.53| .511.56| .51/1.57| .50]1.63| .48/1.58| .49|1.57| .52/1.64| .48| 1.58 | .503 | 4.34** 
86 |1.43| .54/1.61| .50]1.70| .47/1.67| .48|1.57| .50/1.63| .48|1.55| .53/1.72| .45| 1.63 | .493 | 9.66** 
87 |1.41| .541.48| .54]1.55| .52/1.54| .501.53| .511.60| .49]1.59| .52]1.44| .50 1.53 | .514 | 3.45** 
88 |1.52| .54/1.60| .50]1.73| .45]1.77| .42/1.77, .421.86| .35|1.81| .43|1.85| .30 1.75 | .442 |14.45** 
89 |1.58| .5411.69| .49]1.80| .361.90| .33|1.92| .27|1.92] .28|1.87| .381.89| .31| 1.86 | .361 27.78" 
90 |1.20| .45|1.22| .43]1.30| .48|1.33| .49]1.42| .50]1.42| .49]1.47| .53|1.44| .50| 1.35 | .492 | 9.70** 
91 |1.42| .54/1.50| .54]1.51| .52/1.54| .5111.53| .501.54| .501.44| .52/1.53| .50| 1.52 | .513 | 1.70 
92 |1.72| .49/1.80| .45]1.86| .35/1.89| .33/1.89| .32/1.87| .34]1.82| .42]1.89| .32| 1.86 | .358 | 6 53** 
93 |1.48| .55/1.60| .54]1.70| .48/1.69| .46[1.71| .45]1.71| .45|1.71| .48|1.82| .39| 1.69 | .475 | 7 93** 
94 1.28) .52/1.36) .52]1.43| .52/1.48| .51/1.48| .51/1.60| .49]1.49| .53|1.58| .49| 1.46 | .516 | 8 018 
95 |1.50| .56/1.67| .521.77| .431.79| .41]1.78| .42]1.74| .44|1.70  .401.77| .49| 1.75 | .447 |10.95** 
96 |1.37| .551.38| .521.58| .52/]1.58| .50|1.64| .49/1.69| .46]1.70| .49|1.85| .35| 1.59 | .509 [19 ЫШЫ 
97 |1.66| .54|1.70| .51[1.74| .451.80| .401.84| .381.88| .33/1.86| .381.87| .34| 1.80 | .420 | 9 13% 
98 |1.27| .511.33| .51|1.42| .50/1.42) .52|1.30| .48|1.47| .50/1.39| .52/1.50| .50| 1.40 | .507 4.95** 
99 1.51 .5611.63| .52]1.74| .45/1.80| .401.84| .39]1.85| .36/1.80| .43|1.92| .27| 1.78 | .430 19.65** 
100 1.66) .55/1.81) .45/1.90| .311.92| .28|1.94| .251.88| .33|1.84| .41|1.84| .37| 1.89 | .333 18.64** 

*p < .05.
**p < .01.

cent item means with individual t tests in-
dicated that 50 of the items which were sig-
nificant at the .01 level were also monotonic
functions of grade.

To find out whether Form I or Form II
differentiated better across grade levels, the
same procedure was followed as was used
in comparing them in the age analyses. A
2 × 2 contingency table, in which the cell
entries were number of F ratios “significant
and not significant” for the two response
formats, yielded a chi square of 14.79, which
with one degree of freedom was significant 
at the .001 level. An inspection of the dis- 
crepancies between the observed and ex- 
pected frequencies of Fs for the two forms 
made clear that Form II had greater differ- 
entiating power than Form I. 
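
The comparison of observed with expected frequencies can be made concrete from the counts reported for the grade analyses: 10 + 51 = 61 significant Fs for Form I and 4 + 82 = 86 for Form II, each out of 100 items. The sketch below derives the expected cell frequencies from the marginal totals and sums the continuity-corrected contributions. The function names are hypothetical, and the use of the continuity correction here is an assumption based on the statement that the same procedure was followed as in the age analyses.

```python
def expected_frequencies(table):
    """Expected cell counts under independence, from row and column totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def corrected_chi_square(table):
    """Sum of (|O - E| - 0.5)^2 / E over all cells (continuity correction)."""
    expected = expected_frequencies(table)
    total = 0.0
    for obs_row, exp_row in zip(table, expected):
        for o, e in zip(obs_row, exp_row):
            total += max(abs(o - e) - 0.5, 0) ** 2 / e
    return total


# Rows: Form I, Form II. Columns: significant, not significant.
observed = [[10 + 51, 100 - (10 + 51)],
            [4 + 82, 100 - (4 + 82)]]
expected = expected_frequencies(observed)
chi2 = corrected_chi_square(observed)

print(expected)         # expected counts for each form
print(round(chi2, 2))   # corrected chi-square
```

With these counts each form is expected to produce 73.5 significant Fs. Form II exceeds that figure and Form I falls short of it, which is the direction of the discrepancy described above, and the corrected chi-square rounds to 14.79.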

Item Type. The comparison of items
written in the first and third person singu- 
lar was again made by means of a chi- 
square test. This time the data were not 
exactly the same for the two item types, 
as was the case in the age analysis, but 
they were similar enough that the differ-
ences between them in the number of mono-
tonic items each produced were not sta-
tistically significant. The value of the chi
square was 1.00, which with one degree of 
freedom reached only the .30 level. 

Conclusions. In general, the conclusions 
which can be drawn from the grade analyses 
are much the same as those which were 
supported by the age analyses. Responses 
to verbally stated vocational behaviors 
change systematically and consistently from 
grade to grade, instructions to answer items 
as either true or false give better discrimina-
tion between grades than scaling instructions,
and item type has no reliable effect upon 
differentiation between grades. On the basis 
of these results, and those from the age 
analyses, it was decided to conduct any 
further analyses of items only on data ob- 
tained with Form II, with no distinction 
being made between items written in the 
first and third grammatical styles. 


Further Item Analyses 


Item Differentiation. Although most 
theories of vocational development (e.g.,
Ginzberg et al., 1951; Super, 1953, 1957) 
assume that age is the time dimension along 
which changes in vocational behavior oc- 
cur, it is quite possible, as mentioned previ- 
ously, that grade units may be equally, or 
even more, significant as the criteria of
increments and stages in vocational maturity,
since they correspond more closely
to various aspects of development in general:
educational, personal, and social.




Chronological age may correlate more 
highly than grade with physical, and possi- 
bly intellectual, development but not neces- 
sarily with behaviors which are influenced 
more by learning than by heredity. Within 
a given grade, both younger and older stu- 
dents are expected to cope with the same 
developmental tasks and acquire the same 
types of capabilities, whether these involve 
problem-solving abilities or interpersonal 
competencies. Consequently, it would not 
be psychologically unreasonable to expect 
that item differentiation between grades 
might be equal to or even greater than that 
between age intervals. 

This hypothesis was not testable statisti- 
cally on the data of the present study, be- 
cause most of the items which differentiated 
between age and grade levels were the same, 
and consequently they could not be classi- 
fied into independent categories for purposes 
of comparison. A meaningful logical analy- 
sis of the age and grade differences in item 
differentiation can be made, however, by 
noting what the content was of those items 
which were related to one variable but not 
the other. The most clear-cut difference was 
for items which had to do with conceptions 
of the vocational choice process, such as 
“It’s probably just as easy to be successful 
in one occupation as it is in another” and 
“There is only one occupation for each 
individual.” Six times as many of these 
items were related to grade only as were 
associated with age only. Similarly, four 
times as many indecision items, such as 
“I really can't find any occupation that
has much appeal to me,” were independent
functions of grade as compared with age. 
About twice as many work value items 
were contributed by grade alone. Why these 
particular item dimensions vary with grade 
but not with age may be due to the effects 
of a number of factors in addition to the
impact of the educational system upon vocational
development. But whatever the
influence of other variables is, it would 
appear to be less significant than school ex- 
periences. Otherwise, age would have pro- 
duced greater item differentiation than 
grade. Consequently, grade was used as the
criterion for item selection.

which emerged from an inspection of the
plots of Form II item means across grade 
levels was that most of them followed a
curve from predominantly true responses 
in the elementary grades to predominantly 
false responses in the senior high grades. Of 
the 50 items in Form II which were mono- 
tonically related to grade, 43 were increas- 
ing functions from true to false, and only 7 
were decreasing functions from false to 
true. It might be argued from these trends 
that they indicate the operation of an ex- 
traneous response set were it not for two 
considerations. First, most of the items in 
the Attitude test are worded in such a way 
that a true response to them would be a less 
vocationally mature one. Illustrative items
are: “A person can do anything he wants as
long as he tries hard” and “I know very
little about the requirements of occupa- 
tions.” The rationale for constructing items 
which primarily expressed vocationally im- 
mature attitudes, behaviors, and concepts, 
rather than mature ones, was that indis- 
criminate or generalized tendencies to en- 
dorse items as true would be counteracted 
and would not result in spuriously high vo- 
cational maturity scores. Second, some 
items were stated so that their content 
would be expected to elicit a true response 
as the more vocationally mature behavior,
and it was among these that the items with 
decreasing functions of grade were found. 
Examples of such items are: “Choose an 
occupation, then plan to enter it” and “In 
making an occupational choice, you need 
to know what kind of person you are.” 
Thus, the differential response trends to the 
two types of item content, with the mature 
items serving as a control on the immature 
items, may indicate that the tendency to 
answer items true in the lower grades is 
less an effect of test response set than it is 
degree of vocational development. 

Another aspect of the trends in item 
means over grade levels was the extent 
to which the curves proceeded through dis- 
cernible stages and the points at which 
these stages occurred. By defining a stage 
as a significant difference (.01 level) be- 
tween adjacent item means for two grades, 
it was possible to classify the item trends 


into several different groups: first, out of the 
50 items in Form II which were monotoni- 
cally associated with grade, there were 10 
which exhibited continuous curves with no 
significant “breaks” between grades. That 
is, there were no stages, as defined, in the 
trends for these items. Second, there were 
20 items with one stage, 10 of which oc- 
curred between the sixth and seventh grades 
and 4 between the ninth and tenth grades. 
In other words, the steps in the educational 
ladder between elementary and junior high 
school and between junior high school and 
senior high school appear to be related to 
many of the stages which take place in voca- 
tional development. Third, there were 16 
two-stage items, of which 10 had breaks be- 
tween the sixth and seventh grades in con- 
junction with some other combination of dif- 
ferences between adjacent grades, 1 of these 
being between the ninth and tenth grades. 
Again, then, there is evidence of stages at the 
major transitional points in the educational 
system, particularly between the elementary 
and junior high school levels. Finally, 5 
items had as many as three stages, but they
did not conform to any particular pattern,
other than that 3 of them had a stage
between the sixth and seventh grades and 
2 between the ninth and tenth grades. Thus, 
the most provocative finding from the 
analysis of stages in the item trends was 
that a total of 30 out of 50 items had stages 
which corresponded to the basic divisions 
in the educational structure. 
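
The stage definition used above (a difference between adjacent grade means that is significant at the .01 level) amounts to a simple classification rule. The sketch below illustrates it under stated assumptions: the p-values for the example item are hypothetical and are not taken from the study, and the names `find_stages` and `SCHOOL_TRANSITIONS` are introduced here only for illustration.

```python
from typing import Dict, List, Tuple

ALPHA = 0.01  # significance level the text uses to define a stage

# Grade pairs that coincide with the elementary/junior high and
# junior high/senior high transitions discussed in the text.
SCHOOL_TRANSITIONS = {(6, 7), (9, 10)}


def find_stages(p_values: Dict[Tuple[int, int], float],
                alpha: float = ALPHA) -> List[Tuple[int, int]]:
    """Return the adjacent-grade pairs whose mean difference is significant."""
    return sorted(pair for pair, p in p_values.items() if p < alpha)


# Hypothetical p-values for one item (illustrative only, not study data).
example_item = {
    (5, 6): 0.40, (6, 7): 0.003, (7, 8): 0.20, (8, 9): 0.35,
    (9, 10): 0.008, (10, 11): 0.60, (11, 12): 0.15,
}

stages = find_stages(example_item)
at_school_transition = [s for s in stages if s in SCHOOL_TRANSITIONS]

print(f"number of stages: {len(stages)}")
print(f"stages at school transitions: {at_school_transition}")
```

An item with no significant adjacent differences would return an empty list, which corresponds to the "continuous curve" group described in the text.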

A supplementary analysis which was per- 
formed on the trends in item responses was 
occasioned by the chance observation that 
there were four items which would have 
met the “monotonic” criterion had it not 
been for a reversal in the curves of the 
item means in the eleventh and twelfth 
grades. In other words, these two grades had 
means on the four items which were sig- 
nificantly less than the tenth grade—and in 
three instances they were lower than the 
seventh, eighth, and ninth grades. In fact, 
they were more like the fifth and sixth 
grades. Why? When the content of the 
items was examined, it became apparent 
that they expressed a common attitude or
feeling about making a vocational choice
and planning for the future. The items were
the following: 

1. “I often wish that someone would just 
tell me what to do instead of having to 
choose an occupation by myself.” 

2. “You can wait to choose an occupation 
until after you have finished your school- 
ing.” 

3. “If you do the best you can now, the 
future will take care of itself.” 

4. “Sometimes I wish I never had to
work.”

Each item refers to a desire to avoid the 
personal responsibility involved in selecting
and committing oneself to a course of
action which will lead to eventual occupational
entry and a life of work. By responding
to these items more as younger
children did, the eleventh and twelfth 
graders in the sample, who were on the 
threshold of leaving the familiar and secure 
environment of the school, belied their anx- 
iety and concern about venturing forth into 
a world which held unknown or uncertain 
prospects for them. In effect, what they 
did was to react to their apprehensions by 
regressing to modes of response which are 
more typical of earlier stages of vocational 
development. 

Deviation Items. An unexpected phe- 
nomenon which was discovered through an 
inspectional analysis of the graphs for the 
item means was that certain items, a group 
of 10 in all, had two characteristics in 
common: first, they were not related to 
either age or grade; and, second, at each 
age or grade level 20% or less of the sample 
endorsed the items as either true or false, 
depending upon the direction of the predom- 
inant response. In other words, these items 
did not meet one of the necessary conditions 
for the measurement of vocational maturity 
and they were answered in a particular way 
by only a very small segment of the total 
sample. As a result, they were named the 
deviation response, or D, items, and the 
hypothesis was formulated that they may 
measure a maladjustment factor, as Berg 
(1959) has proposed. Additional data on 
the Deviation scale and its relationship to 
vocational maturity are presented and dis- 
cussed on pages 24 and 26. 


Sex Differences⁷


Males and females differ on most nonintellective
measures, as well as a few intellective
ones, and consequently it was
expected that they would differ in their 
responses to the Attitude test, but not as 
little as they actually did. There were only 
four items which differentiated between age 
and grade differently for the sexes. The 
following two items were related to both 
age and grade for males but not females:
“There are so many factors to consider in 
choosing an occupation, it is hard to make 
a decision" and “If you have some doubts 
about what you want to do, ask your parents 
or friends for advice and suggestions." Con- 
versely, these two items differentiated be- 
tween age and grade for females but not 
males: “When it comes to choosing an oc- 
cupation, I'll make up my own mind” and 
“I want to continue my schooling, but I
don't know what courses to take or which 
occupation to choose.” Whether the sex 
differences on these items are reliable ones 
which can be replicated will have to be de- 
termined in the cross-standardization of 
the Attitude test. About the most that can 
be concluded now is that the available data 
indicate only a few differences between 
males and females in the vocational atti- 
tudes and concepts which they endorse as 
self-descriptive. Evidently, sex is not a very 
significant factor in the maturation of these 
verbal aspects of vocational development. 


School Differences 


Since the Attitude test was administered 
in different schools at the elementary and 
junior high levels, it was important to de- 
termine whether there were any systematic 
variations from one school to another in 
item means over and above those attributa- 
ble to differences between grades. To make 

7 Four 7-page tables giving the age and grade 
item means, standard deviations, and F values for 
males and females on Form II have been deposited 
with the American Documentation Institute. 
Order Document No. 8168 from ADI Auxiliary 
Photoduplication Service, Library of Congress, 
Washington, D. C. 20540. Remit in advance $2.00 
for microfilm or $3.75 for photocopies and make 
checks payable to: Chief, Photoduplication Serv- 
ice, Library of Congress. 
this analysis schools were compared within
grades in a simple randomized design (Lind- 
quist, 1953). For the four elementary 
schools in the sample, there were four items 
which were significantly different (.01 level) 
in the fifth grade and two in the sixth grade. 
Between the two junior high schools, there 
were two items which were statistically re- 
liable (.01 level) in the seventh grade, none 
in the eighth grade, and two in the ninth 
grade. Eight of these 10 differences were on 
only two items, however, and these were 
the following: "The best occupation is one 
which has interesting work" and “When I 
am trying to study, I often find myself day- 
dreaming about what it'll be like when I 
start working." At each grade level (ex- 
cept the eighth, where there were no differ- 
ences) the schools which had students from 
lower rent districts consistently had smaller 
mean values for these items. In other words, 
they tended to endorse these statements as 
false more often than the other schools. 
Whether this tendency means the items are 
related to socioeconomic background re- 
mains to be established in further research, 
but it can be concluded that differences between
schools make only a negligible contribution
to the variance of most of the items
in the Attitude test. 


Total Score Analyses 


The VM and D Scales. Once the 50 items 
in Form II which were monotonically related
to grade had been identified, they and
the deviation items were scored for each 
subject in the sample to obtain total VM 
scores and D scores. The items and scoring 
keys for these two scales (Form III) are 
given in Appendix B. For the VM scale it
is noteworthy that, as mentioned previ- 
ously, most of the items are keyed in the 
false direction for a higher vocational ma- 
turity score, and consequently the effect of 
any acquiescence response set associated 
with indiscriminate true endorsement of 
items should be reduced. The distributions 
of VM scores were plotted on normal proba- 
bility paper for each grade and the total 
sample, and they were shown to be es- 
sentially normal, the greatest departures 
being at the eleventh- and twelfth-grade 
levels, as was expected from the way in 
which the items were selected. For the D 
scale the distributions were highly posi- 
tively skewed, since there were only a few 
subjects who answered many of these items 
in the keyed direction, which was directly 
opposite to that of the majority response. 
The means and standard deviations for the 
VM and D scales are summarized for each 
grade and the total sample in Table 6. The 
VM means increase from one grade level 
to another, with the possible exception of 
the eleventh grade where there is a slight 
leveling off, and the standard deviations 
decrease slightly in the upper grades. The 
D means are somewhat larger in the elementary
grades, and there is greater heterogeneity
at this level than later on.


TABLE 6

MEANS AND STANDARD DEVIATIONS OF TOTAL VM AND D SCORES FOR MALES AND FEMALES COMBINED
IN GRADES 5 THROUGH 12
(1961-62)

                                      Scale
                             VM                  D
Grade                     M       SD         M          SD

5 (N = 188)             26.86    5.88       1.78       1.40
6 (N = 150)             29.26    5.74       1.70       1.43
7 (N = 657)             33.25    5.65       1.21       1.18
8 (N = 601)             35.07    5.44       [illeg.]   1.10
9 (N = 703)             36.50    4.82        .98        .91
10 (N = 213)            37.81    4.58       [illeg.]   1.02
11 (N = 131)            37.16    4.72       1.06       1.08
12 (N = 143)            39.00    4.00       1.02       1.03
Total (N = 2786)        34.64    6.03       1.18       1.13


Fig. 4. Cumulative percentage ogives of total VM scores for Grades 5 through 12.


The vocational maturity of the total
sample is just about at the eighth grade
level, as indicated by the mean of 34.64 in
Table 6, which is approximately at the midpoint
of the span between the fifth and
twelfth grades. The mean and standard
deviation of the fifth grade would suggest
that there is considerable “floor” under the
Attitude test, since 98% of the fifth graders
had scores above 15, and the lowest possible
score is 0. The effective lower limit for
the inventory has probably been reached,
however, due to the reading difficulty of
the items which is on the fifth-grade level.
The “ceiling” for the Attitude test would
appear to be adequate for subjects in secondary
school, since only 1% of the twelfth
graders had a score as high as 47 out of a
possible 50; but it remains to be seen
whether it can be used with older subjects
who might be in college, technical school,
or other types of post-high-school training.

Overlap between Grades on VM. One criterion
of the effectiveness or utility of any
instrument which purports to differentiate
between groups or points on a continuum
is the extent to which a low percentage of
score overlap is achieved. Although there
are several different methods for computing
percentage of overlap (Strong, 1943), one
of the most meaningful takes the median of
the score distributions to be compared as
the cut-off or dividing point between them.
In Figure 4 this procedure was followed to 
determine how much overlap there was be- 
tween grades in total VM scores. Cumu- 
lative percentage ogives were plotted 
against VM score for each grade, and then 
percentages of overlap were read from the 
graph for any combination of two grades 
by determining the percentage of scores 
above the point on the ogive of a given 
grade which corresponds to the median of 
the next higher grade. For example: the 
median of the seventh grade is a VM score 
of 34. The ordinate through this point inter- 
sects the ogive for the sixth grade at the 
eightieth percentile. The difference between 
this point and the upper end of the ogive 
is 20%, which is the amount of overlap be- 
tween the VM score distributions of these 
two grades. 
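
The worked example above can be approximated numerically. The author read the overlap from plotted ogives; the sketch below instead assumes a normal distribution for the sixth-grade scores, an assumption suggested only by the statement that the VM distributions were essentially normal. It uses the sixth-grade mean and standard deviation from Table 6 and the seventh-grade median of 34 given in the text. The function name `overlap_above_cutoff` is illustrative.

```python
from statistics import NormalDist


def overlap_above_cutoff(mean: float, sd: float, cutoff: float) -> float:
    """Percentage of a normally distributed group scoring above a cutoff.

    The cutoff is the median of the next higher grade, as in the
    median cut-off method described in the text.
    """
    lower_grade = NormalDist(mu=mean, sigma=sd)
    return 100.0 * (1.0 - lower_grade.cdf(cutoff))


# Sixth grade (Table 6): M = 29.26, SD = 5.74; seventh-grade median = 34.
sixth_vs_seventh = overlap_above_cutoff(mean=29.26, sd=5.74, cutoff=34)

print(f"approximate overlap, Grades 6 and 7: {sixth_vs_seventh:.1f}%")
```

Under the normality assumption the result is close to the 20% read from the graph, which suggests that the normal approximation is reasonable for this pair of grades; it should not be taken as a substitute for the empirical ogives.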

The percentages of VM score overlap for 
all possible pairs of grades are summarized 
in Table 7. The diagonal percentages are 
for adjacent grades, and they are all in the 
30s, with the exception of the eleventh 
grade. As its ogive shows in Figure 4, the 
eleventh grade in this sample was atypical, 
since it was less like the twelfth grade in 
its VM score distribution than was the tenth 
grade.


TABLE 7

PERCENTAGES OF OVERLAP OF TOTAL SCORES
BETWEEN GRADES 5 THROUGH 12 WITH THE
MEDIAN AS A CUT-OFF POINT
(1961-62)

                        Grade
Grade      6     7     8     9    10    11    12

  5       30    11     4     3     4     2
  6             20     4     8     5     7     3
  7                   38    25    17    24    12
  8                         37    28    34    21
  9                               38    47    38
 10                                     50    40
 11                                           32
 12

[Row 5 is shown as extracted: one of its seven entries is missing, and the column positions of its entries are uncertain.]


Whether this phenomenon is a function
of biased sampling at the eleventh-
grade level or the normal course of voca- 
tional development during this period of 
high school needs to be studied further. 
That the sampling would be biased at just 
this one grade level, however, seems un- 
likely since exactly the same sampling pro- 
cedures were used for the eleventh grade 
as were followed for the other grades. For 
the latter there is a definite tendency for 
the percentage of overlap to be larger in 
the upper grades, but this may be an arti- 
fact of the ceiling imposed by the method 
used to select items. If the scoring key had 
been based upon the responses of college 
seniors, the differentiation among Grades 
9 through 12 might have been greater. It is 
also possible, however, that the twelfth 
grade represents the end point in the de- 
velopment of the vocational attitudes meas- 
ured by the Attitude test and that an 
asymptote has been reached beyond which 
further differentiation would not be ex- 
pected. 

VM and D scale Correlations. In addition 
to the above analyses, it was of particular 
interest to compute the correlations of VM 
and D with age and grade and with each 
other, for several reasons. If the Attitude 
test measures individual differences in vo- 
cational maturity, which is a developmental 
variable, then the VM total scores for 
groups should be highly related to age and 
grade, as they are; but VM total scores for 
individuals should be only moderately posi- 
tively correlated with age and grade. If they 


are too highly associated with these varia- 
bles, then the Attitude test would be little 
more than an inefficient measure of age and 
grade. The actual product-moment r's,
based upon the total sample of 2,784 sub- 
jects, were as expected, being .385 between 
VM and age and .463 between VM and 
grade, both of which were significant be- 
yond the .001 level. Also, the value of t 
for the significance of the difference be- 
tween these r's, which was 10.82, exceeded 
the .001 level with 2,781 degrees of freedom 
(Walker & Lev, 1953). In this analysis, 
the correlation of age with grade was r = .908.
For D the predicted relationships with
age and grade were low ones, since one of 
the defining criteria for the deviation items 
was that they should be unrelated to these 
variables. The obtained r’s for D with age 
and grade were —.128 and —.159, respec- 
tively, both of which were significant at 
the .001 level, with 1,931 degrees of freedom. 
Finally, the r for VM and D, with the 
same degrees of freedom, was —.200, which 
again was significant beyond the .001 level. 
In terms of the ways in which the two scales 
are scored, this low negative correlation 
means that the subjects who are less voca- 
tionally mature tend to be more deviant in 
their responses—and possibly, therefore, 
more maladjusted. 
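
The test for the difference between the two VM correlations can be sketched as follows. The text cites Walker and Lev (1953) but does not print the formula; the code assumes Hotelling's t for two dependent correlations that share one variable, which has N - 3 degrees of freedom and so matches the 2,781 reported for N = 2,784. The function and argument names are illustrative.

```python
import math


def hotelling_t(r12: float, r13: float, r23: float, n: int) -> float:
    """Hotelling's t for the difference between dependent correlations r12 and r13.

    Variable 1 is shared by both correlations; r23 is the correlation between
    the other two variables. The statistic has n - 3 degrees of freedom.
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    return (r12 - r13) * math.sqrt((n - 3) * (1 + r23) / (2 * det))


# Variable 1 = VM, 2 = grade, 3 = age, using the correlations reported above.
t = hotelling_t(r12=0.463, r13=0.385, r23=0.908, n=2784)

print(f"t = {t:.2f} with {2784 - 3} degrees of freedom")
```

Computed from the rounded correlations, the statistic comes out close to the reported value of 10.82; the small discrepancy is what would be expected if the original computation used unrounded coefficients, although that is a conjecture.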

Conclusions. When it is considered that 
the Attitude test must differentiate both (a) 
between groups of subjects at different 
grades and (b) between different subjects 
at the same grade in order to measure indi- 
vidual differences in vocational develop- 
ment, the standard deviations in Table 6, 
the percentages of overlap in Table 7, and 
the correlations of VM with age and grade, 
as well as with D, are encouraging. Together,
these data indicate that the Attitude
test measures behaviors which are
highly enough related to age and grade
that they are developmental in nature, but 
not so highly related that they are the same 
as age and grade. 


DISCUSSION


The purposes of this investigation have 
been specifically (a) to construct and standardize
a measure of vocational maturity in
adolescence and more generally (b) to devise
and implement psychometric procedures
appropriate for the measurement of
developmental variables. In the discussion 
which follows, the extent to which these ob- 
jectives have been realized is evaluated 
against the results which were obtained. 
Some extrapolations are then made for 
further research on vocational develop- 
ment. 


Measurement of Vocational Maturity 


The data from the initial standardization 
of the Attitude test indicate that, with
certain qualifications, verbally expressed
vocational behaviors mature with increas- 
ing age and grade during adolescence, much 
as theories of vocational development have 
proposed. Cross standardization of the At- 
titude test needs to be done in order to con- 
trol for the possible effects of sample size 
and bias (Crites, 1964), but there is little 
reason to expect that the results would 
change appreciably upon replication. There 
is fairly conclusive evidence that the voca- 
tional attitudes measured by the inventory 
are general in nature, not being affected by 
either sex or school differences, and that 
they are more closely related to grade than 
age. Why grade is the more significant factor
in this aspect of vocational development
remains to be determined; but one plausible 
hypothesis is that, through its guidance 
services and occupational orientation 
courses, the educational system is a primary 
agent of what might be called “vocational- 
ization” (Crites, 1958). This supposition 
is supported by the limited research which 
has been conducted on the role of the school 
in vocational decision making (Carlin, 1960; 
Wilson, 1959) and is suggested by the find- 
ing that, in the present study, grade was 
differentially related to choice process, in- 
decision, and work value items, all of which 
might be influenced more by educational ex- 
periences and tasks than by other factors. 

The point at which vocational develop- 
ment begins in childhood or early adoles- 
cence has not been established empirically, 
although Ginzberg, Ginsburg, Axelrad, and 
Herma (1951) assumed that it starts at 
approximately age 10 or at about the fifth 


grade. O'Hara (1959) has questioned 
whether this is too late on the basis of 
some exploratory interviews with 6-, 7-, 
and 8-year-olds; but it would appear 
from the reading difficulty level of state- 
ments which describe vocational decision 
making that, whatever other methods might 
be used with younger children, the effective 
lower limit of measurement with an inven- 
tory such as the Attitude test is the fifth 
grade. At the upper extreme, it is still
unknown whether differentiation between 
individuals in their vocational attitudes 
can be achieved beyond the twelfth grade 
or not, but it seems quite likely, since high 
school seniors have a range in their VM 
scores which extends from 25 to 47, thus 
indicating that at least some of them have 
not reached the end of their adolescent vocational
development before entering youth
or early adulthood. To study this problem, 
the Attitude test is currently being admin- 
istered in a large sample of business and 
technical schools as part of the Specialty 
Oriented Student Project (Hoyt, 1962) and 
will be administered to a representative 
sample of college students in the near future. 

That the eleventh grade deviated from 
the general trend of vocational develop- 
ment for the other grades raises the question 
as to why this particular grade was atypi- 
cal, if it is assumed that the sampling was 
not biased. To begin with, it should be 
noted that the variations in the eleventh 
grade item means, as compared with the 
tenth and twelfth grades, are not significant 
ones, otherwise the relationships with grade 
would not be monotonic. But, the differences 
are large enough that the total vocational 
maturity mean of the eleventh grade is 
slightly less than the tenth grade, and its 
score distribution is more like that of the 
ninth grade than any other grade. In de- 
scribing the occupational orientation of the 
eleventh grader (16-year-old), Gesell, Ilg,
and Ames (1956) observe: 


Sixteen is more tentative and open-minded. He 
has a better appreciation of the complexity of the 
career problem. “I’m going to wait and see how 
things will turn out.” “Hard to know." “A new 
idea every week." [p. 360]. 


This attitude of “watchful waiting,” of not 
Similarly, Van de Castle's (1962) findings
on the perceptual maturity of 7-8 year olds,
9-10 year olds, 11-12 year olds, and college
[...] with the true to [...]
geometric figures to more complex
differentiated ones. These similar findings
from diverse sources obtained with different
measuring instruments suggest [...]
underlying factor or process which may
differentiate the attitudinal and conceptual
behavior of younger and older individuals.
It may be that, much like clients in psychotherapy
who overgeneralize and respond
more or less automatically to stimuli
(Bandura, 1961), the vocationally immature
individual may be deficient in his discrimination
learning (Spence, 1960), either because
he is young and has not had the appropriate
experience or because he is older and has
insufficiently developed this facility.

Stages in vocational development can be
defined in several different ways. For the
items in the Attitude test, all of which were
monotonically related to grade, the criterion
was a significant difference between the
means of adjacent grade levels. O'Hara and
Tiedeman (1959) have proposed that there
are three criteria of a stage: discreteness,
which refers to the surges and periods of
quiescence in development; dominance,
which is the pre-eminence of one type of
behavior over another during a given span
of development; and irreversibility, which
means that development is continuous and
does not turn back upon itself. According
to O'Hara and Tiedeman, any one of these
criteria is sufficient to define a stage. For
example, in their study of the vocational
self-concept in adolescence, they classified
essentially linear trends in aptitudes and
general values as “stages” because they were
irreversible, but not because they also manifested
discreteness. It can be argued, however,
that both of these criteria, irreversibility
and discreteness, are necessary to
define a stage, if development is conceptualized
as progressive changes in behavior.
Thus, a nonmonotonic behavioral function
of time, such as perceived social class status
in O'Hara and Tiedeman's study, which is
discrete but not irreversible, would not be
interpreted as exhibiting developmental
stages. Following this reasoning, stages in
items of the Attitude test were identified
only after they were shown to be monotonic
functions of grade. In other words, the criterion
of irreversibility was met first,
[...]

[...] should be pointed out that
[...] for defining a stage have been
[...] from analyses of cross-sectional [...]
longitudinal studies, an adequate definition
of a stage in vocational development would
have to take into account the changes in
behavior which occur between points in time.
Given the irreversibility of a [...]
appropriate procedure would be [...]
significant differences between correlated
means of longitudinal samples tested
on different occasions.*

One aspect of the measurement of vocational
maturity with the Attitude test has
not yet been discussed: the low negative
correlation between the VM and D scales.
The empirical definition of a construct such
as vocational maturity is [...] not
only by the relationships which
measures of it have to other variables but
also by the relationships which they do not
have. If it can be shown in further research
[...] maladjustment, [...]
clarifying the [...]
vocational maturity, on the one [...]
vocational adjustment, on the other. [...]
definitions of these concepts have been proposed
in the past, but they have often been
contradictory and have had little or no
heuristic value (Cowley, 1949; Super,
1955; Super et al., 1957). In contrast, the
operational definitions provided by the VM
and D scales are explicit, and the relationship
between them can be investigated empirically.
One possible research problem
might be an analysis of the [...]
and social characteristics of the subjects
who have various combinations of high and
low scores on the VM and D scales. For
example, it might be that [...]
well-adjusted, vocationally immature individuals
come from overprotective home en-
* The author is indebted to his colleague, L. D.
Goodstein, for suggesting this definition of a developmental
stage based upon longitudinal data.
of an individual's vocational behavior relative
to that of his peers (Crites, 1961).
Although the combined rational-empirical 
approach has much to commend it as a
method for measuring developmental varia- 
bles, there are nevertheless a number of 
issues or problems which are not resolved by 
it, primarily because they are more con- 
ceptual than methodological in nature. 
First, the use of the irreversibility, or mono- 
tonic, criterion for the selection of items 
is based upon the assumption that the 
process of vocational development is a 
continuous one (Ginzberg et al., 1951; Su- 
per, 1957). If there are certain behaviors, 
however, such as those measured by the so- 
called “regression” items, which occur late 
as well as early in the process, then it may 
be that vocational development is some- 
times discontinuous, characterized by slips 
backward as well as steps forward. The 
finding of Gesell, Ilg, and Ames (1956) 
that 15-year-olds are more indefinite and 
undecided about their career choices than 
either 14- or 16-year-olds is consistent with 
this hypothesis and accentuates the need 
for further research on it. Second, only 
recently has systematic attention been 
given to the problem of formulating sta- 
tistical models and developing instruments 
for the measurement of changes in behavior 
over extended periods of time (Harris, 
1963). The dilemma is to construct scales 
which are psychometrically reliable yet 
which are sensitive to the effects of de- 
velopment (Bereiter, 1962). One solution 
which has been conceived for the Attitude 
test is to partition the total score variance 
into components attributable to (a) matu- 
ration and (b) error, as determined through 
retestings of the subjects at intervals which 
range from 1 day to 1 year (Crites, 1964). 
Plots of the test-retest stability coefficients 
against the test-retest intervals should re- 
veal inflections in the curve which will 
identify the different sources of score vari- 
ance. Finally, there is the problem of se- 
lecting appropriate criteria for the valida- 
tion of a developmental measure like the 
Attitude test. Two considerations would 
appear to be paramount: (a) the criteria 
should themselves be developmental in nature,
and (b) they should be moderately
positively interrelated. In other words, they
should define the construct, in the sense of
a set of correlated variables, of which the
test to be validated is presumably one aspect.
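
The plotting of stability coefficients against retest intervals, as proposed above, can be illustrated with a minimal sketch. The scores and the two intervals below are hypothetical and serve only to show the computation; the function names are likewise illustrative and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def stability_curve(retests):
    """Map each test-retest interval (in days) to its stability coefficient."""
    return {days: pearson_r(first, second)
            for days, (first, second) in sorted(retests.items())}


# Hypothetical total scores for five subjects retested after 1 day and after 1 year.
retests = {
    1: ([30, 25, 40, 35, 28], [31, 24, 41, 34, 29]),
    365: ([30, 25, 40, 35, 28], [38, 27, 41, 45, 30]),
}
curve = stability_curve(retests)
```

With more intervals between 1 day and 1 year, the values in `curve` would supply the points of the plot in which inflections are to be sought.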

Closely related to the validation of the
Attitude test is the problem of assessing,
and possibly controlling, the effects of specific
test factors upon variance in total vocational
maturity scores. Item type and response
format were systematically varied
in the experimental forms of the Attitude
test, and the results showed only that the
true-false option produced better differentiation
between ages and grades. Some post
hoc analyses of response style, as manifested
in the tendency to endorse items in a
particular way irrespective of their content,
were made; but further research is needed.
More specifically, response style must be 
studied independently of variations in item 
content, possibly by controlling the latter 
in either one of two ways: (a) compare 
percentages of true-false endorsements of 
equal sets of items which judges have classified
as vocationally “mature” and “immature”;
or (b) compare responses to items
worded in the negative, such as “You do 
not get into an occupation mostly by 
chance," with responses to the original items 
in the Attitude test. Investigation must 
also be made of the effects of acquiescence 
and social desirability response sets upon 
item endorsements. Since the Attitude test 
is a measure of verbal behavior, extraneous 
response sets could appreciably affect the
veridicality of the self-reports which it is
designed to elicit. Presumptive evidence
from the reading difficulty level of the inventory
and the trends in item responses
across age and grade suggest that these fac- 
tors are only minimally operative, but more 
direct data on their significance must be 
obtained. 
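
The first of the two controls described above, a comparison of true-false endorsement percentages on equal sets of items judged "mature" and "immature," might be sketched as follows. The response patterns are hypothetical, and the interpretation of the difference score is an assumption offered for illustration rather than a procedure reported in the study.

```python
def percent_true(responses):
    """Percentage of items in a set that were endorsed as true."""
    return 100.0 * sum(responses) / len(responses)


def endorsement_profile(mature_items, immature_items):
    """Compare true endorsements on two equal sets of judged items."""
    mature_pct = percent_true(mature_items)
    immature_pct = percent_true(immature_items)
    return {
        "mature": mature_pct,
        "immature": immature_pct,
        "difference": mature_pct - immature_pct,
    }


# Hypothetical responses (True = endorsed "true") to two sets of four items each.
content_driven = endorsement_profile([True, True, True, False],
                                     [False, False, True, False])
yea_sayer = endorsement_profile([True, True, True, True],
                                [True, True, True, False])
```

A subject whose endorsements are high on both sets regardless of judged content, as in the second hypothetical profile, would be a candidate for an acquiescent response style.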


Study of Vocational Development 


The general approach which has been 
followed in the construction and standard- 
ization of the Attitude test has been frankly 
descriptive and normative. Beilin (1963) 
has questioned the value of this type of 
orientation in developmental research, however,
on two grounds. First, he points out
that the concept of maturity implies unity,
either as defined by a generalized factor or 
by an additive factor, both of which have 
shortcomings: 


Past difficulty with the unity idea has come, 
for one, from the inability to demonstrate "g" 
factors in particular developmental areas—as in 
the case with intelligence tests—where instead of 
growing reliance on a unitary conception of intelligence
there is increasingly greater respect for
multifactorial conceptions of intelligence. Where 
maturity, on the other hand, has been repre- 
sented by an assumed additive unity it has been 
difficult to demonstrate that the elements can be 
added legitimately or that they change with age 
in predictable ways [pp. 780-781]. 


Second, he argues that the gathering of 
norms on developmental phenomena repre- 
sents a misdirected research effort: 


The danger of focusing on normative data and 
normative developmental conceptions is that it 
leads us away from learning about the funda- 
mental processes and mechanisms which lead to 
the behavior we have been describing [p. 781]. 


He goes on to suggest that research on
vocational development might better be
concerned with the role of cognition in decision-making
and that there should consequently
be less emphasis upon naturalistic
observations and more upon laboratory and 
field experimentation. 

There can be little quarrel with Beilin’s 
admonition that the processes, whether cog- 
nitive or learning or both, which are related 
to vocational development should be inves- 
tigated—preferably under highly controlled 
conditions. But it is a non sequitur that the
normative study of vocational development 
necessarily interferes with such an enter- 
prise, or, indeed, that it cannot contribute 
significantly to it. In fact, the normative 
data which have been collected on the Atti- 
tude test would argue that just the opposite 
is true. Although it was an unexpected finding,
the fact that there was a trend in item
means from predominantly true to false 
responses with progressions in age and 
grade suggested that one of the primary 
processes underlying vocational develop- 
ment may be discrimination learning. With 
the Attitude test as a selection instrument, 
it will be possible to test this hypothesis in 
the laboratory by comparing subjects with
high and low vocational maturity scores on
a variety of discrimination learning tasks. 
Another advantage of normative studies of 
vocational development, which is of equal 
if not greater importance, is that they can 
provide the link between the often artificial 
conditions of the laboratory and the reali- 
ties of everyday career behavior. It might 
be found, for example, that certain experi- 
mental treatments produce changes in the 
verbal behaviors measured by the Attitude 
test. If the normative characteristics of the 
test are known, extrapolations can legiti- 
mately be made to the extralaboratory sit- 
uation, and the generalizability of the find- 
ings can be increased considerably. 

The integration of psychometric research, 
such as that on the Attitude test, with ex- 
perimental work on the stimulus conditions 
which affect decision-making processes 
holds considerable promise for the profit- 
able study of vocational development in the 
future (Cronbach, 1957). Concept forma- 
tion (Bruner, Goodnow, & Austin, 1956), 
discrimination learning (Spence, 1960), 
learning sets (Harlow, 1949), utility for 
risk (Ziller, 1957)—these and other con- 
cepts from experimental psychology are di- 
rectly relevant to the analysis and explana- 
tion of vocational phenomena. The next 
step is to design and execute experiments 
based upon them which will test hypotheses 
about the role of the higher mental proc- 
esses in determining and regulating the na- 
ture and course of vocational development. 


SUMMARY 


Recently formulated theories of voca- 
tional development have proposed that, in 
contrast to earlier conceptions, vocational 
choice is a process which takes place 
throughout the period of adolescence, from 
approximately age 10 to entry into adult- 
hood at age 21. Consistent with this em- 
phasis upon the longitudinal nature of vo- 
cational decision-making is the concept of 
vocational maturity which has been introduced
to describe the various behavioral dimensions
along which vocational development
has been hypothesized as proceeding.
Satisfactory empirical meaning has not 
been given to the construct of vocational 
maturity, however, for either theory testing
or practical application, not only because
of lack of research but also because of the
inherent difficulties involved in the measurement
of developmental variables. Consequently,
as part of a long-term investigation
of vocational development in
adolescence, the present study had two
purposes: the psychometric one of constructing
and standardizing a measure of
vocational maturity, and the methodological
one of attempting to solve some of the
problems encountered in the assessment of 
developmental phenomena. 

The measure which was developed in this 
research is the Attitude test of the Voca- 
tional Development Inventory, which even- 
tually will also consist of the Competence 
test. These two tests have been designed to
define the cognitive and conative aspects of 
vocational maturity which theoretically ap- 
pear to be most closely related to the medi- 
ation of vocational choice behavior during 
adolescence. The methodological approach 
which was used in the construction and 
standardization of the Attitude test, and 
which will also be followed in the research 
on the Competence test, is based upon a 
synthesis of the principles of the rational 
and empirical models for test development. 
It integrates steps for writing items which 
are theoretically and substantively mean- 
ingful with procedures for selecting items 
which are related empirically to a relevant 
criterion. For the Attitude test, items were
written which described various concepts of 
the vocational choice process, feelings about 
making career decisions, work values, etc., 
and then the relationships of the items to 
age and grade, as the criterion variables, 
were determined. If the items were mono- 
tonically associated with age and grade, 
thus indicating that they followed the pro- 
totypic trend of development, they were ac- 
cepted for inclusion in the Attitude test. 
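
The selection rule summarized above, retaining an item only if its responses change monotonically across successive levels of the criterion, can be expressed in a short sketch. The item names and proportions are hypothetical, and the simple direction check used here is an assumption; it stands in for whatever statistical test of trend was actually applied.

```python
def is_monotonic(proportions):
    """True if the sequence never reverses direction (ties are allowed)."""
    pairs = list(zip(proportions, proportions[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def select_items(item_trends):
    """Keep only the items whose response trend is monotonic across levels."""
    return [item for item, props in item_trends.items() if is_monotonic(props)]


# Hypothetical proportions of "true" responses at eight successive grade levels.
trends = {
    "item_a": [0.80, 0.74, 0.66, 0.60, 0.51, 0.43, 0.40, 0.31],  # steady decline
    "item_b": [0.50, 0.62, 0.45, 0.58, 0.47, 0.55, 0.49, 0.52],  # no clear trend
}
selected = select_items(trends)
```

Only the first hypothetical item would be retained, since its proportion of true responses declines at every step, in the direction of the true-to-false trend noted in the results.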

The results of the item analyses for age 
and grade based upon a total sample of ap- 
proximately 3,000 subjects tested with two 
experimental forms of the Attitude test, 
which systematically varied response for- 
mat and item type, led to the following 
conclusions: first, verbal vocational behaviors
are monotonically related to both
age and grade, but are more frequently associated
with the latter than the former.
Second, a true-false response format provides
better item discrimination between
grades than a Likert-type rating scale.
Third, items written in the first and third
person singular produced essentially the
same amount of item differentiation across
age and grade levels. Fourth, the most notable
trend in item response by age and grade
was from predominantly true responses in
the elementary school years to predominantly
false responses in the high school
years. Fifth, for the items in which “stages”
were identifiable, they tended to occur primarily
(30 out of 50 items) between the
sixth and seventh grades or between the
ninth and tenth grades, which are the transitional
points in the 6-3-3 school system
used in the study. Sixth, 10 so-called deviation
response items were discovered which
were not related to age or grade and which
were answered as either true or false by
only a small proportion (20% or less) of
the total sample. The hypothesis was formulated
that these items might measure
vocational maladjustment as contrasted
with vocational maturity. Finally, it was
found that there were very few differences
between males and females and between 
schools in high and low rent districts on the 
items which differentiated between grades. 
In addition to the item analyses, several 
total vocational maturity score analyses 
were conducted which yielded the following 
findings: the average vocational maturity 
of the entire sample was at approximately 
the eighth grade; there was an increase in
vocational maturity at all grade levels except
the eleventh grade, which was atypical;
the correlation of vocational maturity
with age was .385 and with grade was .463;
and the relationship between vocational
maturity and deviation responses, which
may indicate vocational maladjustment,
was low negative (r = -.20).
The implications of the results were discussed
with respect to three problems: the
measurement of vocational maturity, the 
methodology of test construction, and the 
study of vocational development. It was 
proposed that a combined rational-empirical
approach to the measurement of developmental
constructs, such as vocational
maturity, has considerable promise not only
for the normative standardization of an instrument
but also for the identification of
those psychological processes which mediate
vocational choice behavior. Specific
studies which might be conducted in both
the field and the laboratory were briefly 
outlined, and some lines of inquiry for fu- 
ture research on vocational development 
were suggested.

APPENDIX A

TABLE A1
VARIABLES IN THE ATTITUDE TEST OF THE VOCATIONAL DEVELOPMENT INVENTORY

Dimension                 Definition                              Sample item

Involvement in the        Extent to which individual is ac-       “I seldom think about the job I
choice process            tively participating in the process     want to enter.”
                          of making a choice

Orientation toward        Extent to which individual is task-     “Work is dull and unpleasant.”
work                      or pleasure-oriented in his attitudes   “Work is worthwhile mainly be-
                          toward work and the values he places    cause it lets you buy the things you
                          upon work                               want.”

Independence in deci-     Extent to which individual relies       “I plan to follow the line of work
sion-making               upon others in the choice of an oc-     my parents suggest.”
                          cupation

Preference for voca-      Extent to which individual bases        “Whether you are interested in a
tional choice factors     his choice upon a particular factor     job is not as important as whether
                                                                  you can do the work.”

Conceptions of the        Extent to which individual has          “A person can do any kind of work
choice process            accurate or inaccurate conceptions      he wants as long as he tries hard.”
                          about making an occupational choice

APPENDIX B 
FORM III OF THE ATTITUDE TEST OF THE VOCATIONAL DEVELOPMENT INVENTORY
AND SCORING KEYS FOR THE VM AND D SCALES


Directions 


Listed below are a number of statements about
occupational choice and work. Read each statement
and decide whether you agree with it or
disagree with it. If you agree or mostly agree with
the statement, blacken the circle in the column
headed T on the separate answer sheet. If you
disagree or mostly disagree with the statement,
blacken the circle in the column headed F on the
answer sheet. Be sure your marks are heavy and
black. Erase completely any answer you wish to
change.

1. You have to know what you are good at,
and what you are poor at, before you can choose
an occupation.

2. Ask others about their occupations, but
make your own choice.

3. It’s unwise to choose an occupation until
you have given it a lot of thought.

4. Once you make an occupational choice, you
can’t make another one.

5. In making an occupational choice, you need
to know what kind of person you are.

6. A person can do anything he wants as long
as he tries hard.

7. Your occupation is important because it determines
how much you can earn.

8. A consideration of what you are good at is
more important than what you like in choosing an
occupation.

9. Plans which are indefinite now will become
much clearer in the future.

10. Your parents probably know better than
anybody which occupation you should enter.


cision. 

17. Sometimes you can't get into the occupation
you want to enter.

18. You can't go very far wrong by following 
your parent's advice about which occupation to 
enter. 

19. Working in an occupation is much like go- 
ing to school. 

20. The best thing to do is to try out several 
occupations, and then choose the one you like 


21. There is only one occupation for each individual.

22. The most important consideration in choos- 
ing an occupation is whether you like it. 

23. Whether you are interested in an occupation
is not as important as whether you can do the
work.

24. You get into an occupation mostly by
chance.

25. It’s who you know, not what you know, 
that’s important in an occupation. 




26. Choose an occupation which gives you a 
chance to help others. 

27. Choose an occupation, then plan how to
enter it.

28. Choose an occupation in which you can 
someday become famous. 

29. If you have some doubts about what you 
want to do, ask your parents or friends for advice 
and suggestions.

30. Choose an occupation which allows you to 
do what you believe in. 

31. The most important part of work is the 
pleasure which comes from doing it. 

32. It doesn’t matter which occupation you 
choose as long as it pays well. 

33. As far as choosing an occupation is con- 
cerned, something will come along sooner or later. 

34. Why worry about choosing an occupation 
when you don’t have anything to say about it 
anyway.

35. The best occupation is one which has interesting
work.

36. I really can't find any occupation that has 
much appeal to me. 

37. I have little or no idea of what working 
will be like. 

38. When I am trying to study, I often find 
myself daydreaming about what it'll be like when 
I start working.

39. If I have to go into the military, I think 
I'll wait to choose an occupation until I'm out.

40. When it comes to choosing an occupation, 
I'll make up my own mind.

41. I want to really accomplish something in 
my work—to make a great discovery or earn lots 
of money or help a great number of people. 

42. As long as I can remember I've known 
what I want to do. 

43. I can’t understand how some people can be 
so set about what they want to do. 

44. My occupation will have to be one which 
has short hours and nice working conditions. 

45. The occupation I choose has to give me 
plenty of freedom to do what I want. 

46. I want an occupation which pays good 
money. 




occupation. 

48. I know which occupation I want to enter, 
but I have difficulty in preparing myself for it. 

49. I know very little about the requirements 
of occupations. 

50. I want to continue my schooling, but I 
don't know what courses to take or which occupa- 
tion to choose. 

51. I spend a lot of time wishing I could do 
work that I know I cannot ever possibly do. 

52. I'm not going to worry about choosing an 
occupation until I'm out of school.

53. If I can just help others in my work, I'll be 
happy. 

54. I guess everybody has to go to work sooner 
or later, but I don't look forward to it. 

55. I often daydream about what I want to be, 
but I really don't have an occupational choice. 

56. The greatest appeal of an occupation to me 
is the opportunity it provides for getting ahead. 

57. Everyone seems to tell me something dif- 
ferent, until now I don't know which occupation 
to choose. 

58. I have a pretty good idea of the occupation 
I want to enter, but I don't know how to go about 
it. 

59. I plan to follow the occupation my parents 
suggest. 

60. I seldom think about the occupation I want 
to enter. 


VM SCALE
4.F 15. F 97.'T 33.F Siam 
БР 16. Т 98. F 39.F 52H 
6. F 18. F 29. T 40. T 53. E 
7. F 19. F 30. T 4.F 54. F 
8.F 20. F apu 42. F 55. F 
10. F 21. F Some г "OCHE 
П.Е 23. Е Sgr 45. p 1572 
12. F 24. F 34.F — 48.F 58.F 
13. F 25. F Sore 49. p ° 502 
14. F 26. F 37. F 50. F 60. F 

D SCALE
1. Е 3. Е WF 35.82 4634
2. F 9. Е 22.P. .44 T. 4


EFFECTS OF PERINATAL ANOXIA AFTER SEVEN YEARS


NORMAN L. CORAH, E. JAMES ANTHONY, PAUL PAINTER, JOHN A. STERN, 
AND DONALD L. THURSTON 


Washington University School of Medicine 


Assessment of cognitive and perceptual functioning, personality, and neurological
impairment was made of 235 7-yr.-olds. The sample was composed of
134 children who were normal full-term newborns and 101 children who were 
anoxic full-term newborns. The group represented 85.5% of the sample which 
had been followed up at 3 yr. of age. Anoxics did not differ significantly from 
normals in intelligence and signs of neurological impairment. The anoxic 
group did show impairment in the areas of verbal abstract ability, perceptual 
skills, and social competence. In general, the anoxics showed minimal impair- 
ment of functioning. Attempts to predict current functioning from newborn 
measures of severity of anoxia proved to be highly unreliable. 


————— 


The present research was concerned
with an evaluation of the effects of peri- 
natal anoxia in 7-year-old children who 
had been studied at birth and at 3 years of 
age (Caldwell, Graham, Pennoyer, Ern- 
hart, & Hartmann, 1957; Graham, Ern- 
hart, Thurston, & Craft, 1962; Graham, 
Matarazzo, & Caldwell, 1956; Graham, 
Pennoyer, Caldwell, Greenman, & Hart- 
mann, 1957; Pennoyer, Graham, & Hart- 
mann, 1956; Thurston, Graham, Ernhart, 
Eichman, & Craft, 1960). The assumption 
involved in all such investigations is, of 
course, that anoxia produces brain damage 
and that this effect is detectable at all age
levels.

Pasamanick and co-workers (Knobloch
& Pasamanick, 1959; Lilienfeld, Pasam- 
anick, & Rogers, 1955; Rogers, Lilienfeld, 
& Pasamanick, 1955) have postulated a 
“continuum of reproductive casualty” as 
resulting from complications of the peri- 
natal period. This continuum may be 
represented by death at one extreme and 
by a “syndrome of minimal cerebral 
damage” at the other extreme. It is the 
latter which is more difficult to delineate. 
However, the concept of a minimal brain- 
damage syndrome has received support by 
a number of other writers (Gesell & Arma- 
truda, 1941; Paine, 1962). 

While it is not possible to “prove” that 
small degrees of anoxia will produce brain 
damage in humans, such a demonstration 
has been made for lower animals (Windle,
1960, 1963). Windle (1963) has demon- 
strated that a period of asphyxia in Macaca 
mulatta which does not require resuscita- 
tion may still produce discrete brain 
lesions. There were no resulting behavioral 
deficits that could be measured. 

It is beyond the scope of this paper to 
review all of the available evidence which 
has accumulated on the issue of the effects 
of anoxia. Recent reviews of the pertinent 
literature have been provided by several 
authors (Bailey, 1958; Graham, Cald- 
well, Ernhart, Pennoyer, & Hartmann, 
1957; Keith & Gage, 1960). Retrospective
studies have generally demonstrated impairment
as a function of anoxia. However,
most of these studies began with
cases having known deficits. Prospective
studies have, on the other hand, produced
conflicting results. Some have shown a
deleterious effect of anoxia, while others
have not. There are several possible explanations
for the lack of agreement in
the results of the prospective studies. A
number of these problems have been discussed
by the previous investigators of the
St. Louis project (Graham, Caldwell, Ern- 
hart, Pennoyer & Hartmann, 1957; Graham 
et al., 1962). 

These difficulties have included loss of 
cases in follow-up, the investigators’ knowl- 
edge of the subjects’ (Ss’) birth status, 
the criteria of anoxia which have sometimes 
included other complications as well (cf.
MacKinney, 1958; Schachter & Apgar, 
1959), and the control of irrelevant vari- 
ables such as sex and socioeconomic status. 
A variable which has received too little 
attention is the age at which children with 
perinatal complications are tested. Recent 
research would suggest that differences be- 
tween brain-damaged and normal children 
in some areas of functioning will be appar- 
ent at all ages—some at earlier but not 
later ages; and some will not be apparent 
in early childhood, but can be demon- 
strated in later childhood or adolescence 
(Rudel, Teuber, Liebert, & Halpern, 1960; 
Teuber & Rudel, 1962). The present re- 
search has attempted either to minimize 
these problems or to deal with them 
explicitly. It is only with extreme care in 
these matters that future status as the 
dependent variable can be related to peri- 
natal status as the independent variable. 

The most comprehensive report of the 
selection at birth and 3-year follow-up 
studies in the St. Louis project is given in 
Graham, Ernhart, Thurston, and Craft 
(1962). Two methods of identifying anoxia 
were used: signs of respiratory delay 
or apnea, and clinical signs of prenatal or 
intrauterine anoxia. At 3 years of age, the 
anoxics were significantly poorer than controls
in intelligence and related cognitive
functions, on a few but not most ratings of
personality traits, and the anoxics had a significantly
greater number of positive and
suggestive neurological signs than did controls.
There were no significant differences
on tests of perceptual-motor functioning.
Significant, but low, correlations were found
between the measures which differentiated
the groups at 3 years of age and a prognostic
score indicating severity of perinatal
anoxia. These results were interpreted as
giving support to the concept of a continuum
of reproductive casualty. (Most of
the findings of the previous studies will be
considered in greater detail below.)

For a number of reasons, the present investigation
was undertaken when the
children became 7 years of age. First, it is
well known that most measures of cognitive
and intellectual functioning at preschool
ages do not allow a high degree of accuracy
in predicting later functioning. Many
aspects of such functioning are just beginning
to develop during this period.
Consequently, it was felt that much more
reliable measures of functioning in this
area could be obtained at age 7. (Areas of
functioning, such as that of psychophysiological
responsiveness, which would not have
been assessed very easily in the earlier studies
could now be included.) Hence, plans
were made to evaluate more areas. The children
were also at an age at which they could
reasonably be expected to have started either
the first or second grades in school. This
event would allow for other avenues of functioning
to be explored, such as teachers’ reports
of school behavior and an assessment


perceptual-cognitive, neurological,
emotional-social, and psychophysiological. The
first three of these areas were assessed in
the 3-year follow-up but not as extensively
as was possible in the current study. Since
the area of psychophysiologic functioning
represented a departure from the previous
investigations of the St. Louis project, this
work will be presented elsewhere. There
were several tentative hypotheses which
guided this research.

Perceptual-cognitive functioning is a
particularly inviting area of study since it
is amenable to reasonably objective assessment.
The most striking deficits in the 3-year
study were found in this area
(Graham et al., 1962). It was hypothesized
that the significant difference in intelligence
obtained in the 3-year study would be
[...]. This hypothesis was based on the
experimental work of Kennard (1942), who
demonstrated considerable plasticity in the
brain functioning of early brain-damaged
monkeys when compared to animals
damaged at maturity. Further support
comes from clinical observations which
indicate that many of the symptoms of
organic brain damage, such as irritability,
hyperactivity, and impulsiveness, appear
to diminish as the child matures. At the
same time, it was anticipated that abstract
functions as seen in vocabulary skills
would continue to differentiate the control
from the anoxic group. This hypothesis
was based on Hebb's (1949) suggestion
that early brain damage would affect
[...]ing structures in the brain which
would retard development in this area.
He also predicted that sensorimotor
functions would not be as greatly affected
by early injury. Whether or not this assumption
would apply to perceptual-motor
functioning in the child remained an open
question, since no differences were found in
this area in the 3-year study.

The anoxics in the 3-year sample had a
greater number of positive and suggestive
neurological signs than did controls. It was
expected that this difference would still
be in evidence, and that the frequency of
neurological deficits might even show an
increase for both groups because of the
limitations of assessment at 3 years of age.

Personality or emotional-social functioning
is both one of the most interesting
and, at the same time, one of the
most difficult to assess. There is an extensive
clinical literature which has given rise
to the concept of a brain-injured personality
syndrome consisting of such traits as
hyperactivity, impulsivity, distractibility,
[...]ity, and emotional lability (Bak[...],
[...]9; Bender, 1956; Bradley, 1955;
[...] & Peters, 1962; Eisenberg, 1957;
[...], Denhoff, & Solomons, 1957; Levy,
[...]; [...]ver, 1958; Strauss & Kephart, 1955;
[...] & Davis, 1941). Graham and Berman
(1961), in reviewing the research literature
in this area, have concluded that this syndrome
has not yet been verified by research.
The previous investigators attempted to 
differentiate between brain-injured traits 
and traits of maladjustment with little 
success (Graham et al., 1962). However, at 
least one study in this area (Bolin, 1959)
has demonstrated a greater number of 
fears, signs of anxiety, and immaturity in 
children whose birth was of long duration 
as compared with children whose birth was 
of short duration. Traits such as fearful- 
ness and anxiety would presumably be 
classified as traits of maladjustment rather 
than as brain-injured traits. The present 
research adopted several procedures to 
deal with this area. 

In addition to the assessment of group
differences, plans were made to assess the 
relationship between severity of anoxia 
and functioning in each of these areas.
Such a procedure might allow for a test of 
the concept of a continuum of reproductive 
casualty in a limited sense. 


of papers (Caldwell et al., 1957; Graham et al.,
1956; Graham, Pennoyer, Caldwell, Greenman, &
Hartmann, 1957; Pennoyer et al., 1956).
Follow-up Ss 


The 3-year study selected 191 or 53% of normal 
newborns for follow-up since there were many more 
normals than were needed for comparison with the 
complicated groups. All of the anoxics which had 
survived the first few days of life were included in 
the follow-up. This group consisted of 132 Ss. The 
3-year study also included other complicated 
groups, but these were excluded from the 7-year 
follow-up because these groups were smaller and 


TABLE 1
DISPOSITION OF THE 7-YEAR SAMPLE

Sample                                   Normal   Anoxic   Total

Selected for follow-up and examined
  at 3 years                               159      116     275
Lost
  Moved                                     13        6      19
  Uncooperative                              3        3       6
  Intervening CNS trauma                     0        1       1
  Not located                                9        5      14
  Total lost                                25       15      40
  % lost                                  15.7     12.9    14.5
Examined                                   134      101     235
Percentage of live newborn sample
  originally selected for follow-up       70.2     76.5    72.8


because of practical limitations. Loss of Ss in the 
3-year sample involved 16.8% of the normal newborns
and 12.1% of the anoxics. Complete details
of the previous follow-up are given in Graham,
Ernhart, Thurston, and Craft (1962).

All normal newborns and anoxics who were ex- 
amined in the previous follow-up were selected for 
study at 7 years of age. This group involved a total 
of 275 Ss. Attempts were made to locate all cases. 
It was possible to trace most of the families who 
had moved from the area to various parts of the 
United States and, in one case, even to Europe. 
Fourteen cases were not located although nearly 
every resource including public records, school sys- 
tems, old neighborhood friends and acquaintances, 
certified mail, and newspaper advertisements was 
used. In some cases, only part of the examination 
was given by researchers who visited families who 
could not come to the testing center. Families who 
had moved from the area were invited to partici- 
pate if they should visit the St. Louis area. Twelve 
of these 31 families did participate. As a rule, only
those cases living within 100 miles of St. Louis were 
considered for examination except as noted above. 
Location efforts were made without knowledge of 
newborn status. 

A few weeks before the child's seventh birthday,
a letter was sent to the parent requesting their par-
ticipation in bringing the child in for examination. 
The study was referred to as one of “psychological 
and physical development.” Each family was of- 
fered a nominal sum of money for “their time” and 
transportation, lunch, and baby sitter expenses. If 
the family did not return the reply card enclosed 
with the letter within a week, telephone contacts 
were initiated to urge their cooperation. Approxi- 
mately 10% of the sample required urging after 
they were located. 

Of the 275 Ss selected for follow-up, 235 were 




examined. The 40 lost cases, or 14.5% of the sample,
were fairly evenly distributed between the two
groups. There is no evidence available to suggest
that children's defects played any part in the loss
of cases. Of the 6 who refused to cooperate, all
except 1 appeared to be the result of family discord:
divorces and remarriages. One mother had strong
negative attitudes toward the hospital where she
had been employed. Records collected by the
previous investigators did not suggest that defects in
the children played any role. All of the reasons for
loss are given in Table 1.
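
The loss percentages can be checked directly from the counts in Table 1. The following is a minimal sketch in Python; the dictionary and function names are illustrative only and are not part of the original report.

```python
# Counts transcribed from Table 1: Ss examined at 3 years and Ss lost by 7 years.
SELECTED = {"normal": 159, "anoxic": 116}
LOST = {"normal": 25, "anoxic": 15}


def percent_lost(lost, selected):
    """Percentage of the selected Ss who were lost, rounded to one decimal."""
    return round(100.0 * lost / selected, 1)


def attrition_summary(selected=SELECTED, lost=LOST):
    """Per-group and overall loss percentages."""
    summary = {group: percent_lost(lost[group], selected[group]) for group in selected}
    summary["total"] = percent_lost(sum(lost.values()), sum(selected.values()))
    return summary


if __name__ == "__main__":
    print(attrition_summary())
```

Under these counts the overall loss is 40 of 275 Ss, which matches the 14.5% stated in the text.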


Newborn Measures 


The material which follows briefly summarizes
the various newborn measures and is excerpted from
previously published accounts (Ernhart, Graham,
& Thurston, 1960; Graham, 1956; Graham et al.,
1962; Graham, Pennoyer, Caldwell, Greenman, &
Hartmann, 1957).


Criteria for Selection of the Groups 


Normal Newborns. This group was composed of
full-term infants without evidence of significant
complications.

Perinatal circumstances were considered satisfactory
only when there was a controlled
spontaneous vaginal delivery or delivery by
low forceps, respiration and cry were established
in a few seconds, and color and activity
were judged to be normal by the delivery room
staff [Graham et al., 1962, p. 8].

Oxygen saturation of blood samples was not considered
in making this classification.

Anoxic Newborns. Clinical signs of fetal anoxia
or postnatal apnea were used to classify Ss as
anoxic. There were three anoxic subgroups: prenatal
anoxics, postnatal anoxics, and perinatal
anoxics or Ss with both prenatal and postnatal
signs. All anoxics were full-term infants. The
method of rating was specified in advance. Normal
Ss had, of course, ratings of zero. These ratings
were weighted scores derived from the number 


Pennoyer, Caldwell, Greenman, and
Hartmann (1957). High scores reflected severe
anoxia, while low scores indicated a milder condition.
The mean prenatal ratings for the subgroups
in the 7-year follow-up were: prenatal anoxia subgroup,
.72; perinatal anoxia subgroup, 1.08. The
mean postnatal ratings were: postnatal anoxia subgroup,
1.13; perinatal anoxia subgroup, 1.34.


Newborn Behavior 


Signs of Central-Nervous-System (CNS) Disturbance.
This clinical rating was obtained by
weighting signs in the medical record. The weighting
system is given in Caldwell, Graham, Pennoyer,
Ernhart, and Hartmann (1957). Table 2 presents


TABLE 2
DISTRIBUTION OF RATINGS OF CNS DISTURBANCE
FOR NEWBORNS REEXAMINED AT 7 YEARS

                         CNS rating
Group            0    .5-1.0   1.5-2.0   2.5-3.0
Normal          134      0        0         0
Prenatal         26      1        1         0
Postnatal        37      0        3         0
Perinatal        18      6        6         3
Total           215      7       10         3


> 


the frequency distribution of CNS weights in the
Ss examined at 7 years. It is readily apparent
that there were few Ss with such signs.

Behavior Tests. There were five newborn tests.
The first two were assumed to reflect newborn CNS
development, while the last three were assumed to
measure current nervous-system functioning.
The Maturation scale involved exposure of the
infant to several stimulating conditions with the
responses being rated according to four categories.
The Vision scale was a measure of eye movements
and orientation and involved judgments of extent
and direction of visual pursuit. The Pain Threshold
test measured the intensity of a mild electric
current which was just sufficient to elicit a response.
The Irritability rating involved the infant's sensitivity
to general environmental stimulation. Ratings
of muscle tone in the direction of flaccidity
or rigidity yielded a Muscle-Tension rating. Complete
details of these measures are given in Graham
(1956) and Graham, Matarazzo, and Caldwell
(1956). Newborn test data were not available for
in the follow-up study. 


Prognostic Score


The two ratings of anoxia and the rating of
CNS disturbance were summed to provide an index
of the severity of anoxia.

Scores could range from 0 to 9 with higher
scores indicating a poorer prognosis. For scores
up to 1.5, the pediatricians against whom the
score was validated tended to consider the
prognosis good. With scores between 1.75
and 3.0, the prognosis was considered questionable,
and above 3.0, guarded [Graham et al.,
1962, p. 13].

The mean Prognostic scores of subgroups in the
7-year study were: prenatal anoxics, .79; postnatal
anoxics, 1.27; and perinatal anoxics, 3.14.
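
The scoring rule quoted above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original scoring procedure: the function names are hypothetical, a score of zero is assumed to mean that no ratings were assigned (as for normal Ss), and a score falling between 1.5 and 1.75 is assigned to the questionable category because the quoted passage does not say how such values were handled.

```python
def prognostic_score(prenatal_rating, postnatal_rating, cns_rating):
    """Sum of the two anoxia ratings and the CNS-disturbance rating."""
    return prenatal_rating + postnatal_rating + cns_rating


def prognosis_category(score):
    """Map a Prognostic score (0 to 9) onto the quoted verbal categories.

    Assumption: scores between 1.5 and 1.75, which the quoted passage
    does not mention, are treated as questionable.
    """
    if not 0 <= score <= 9:
        raise ValueError("Prognostic scores range from 0 to 9")
    if score == 0:
        return "no signs"        # assumed label for Ss with all ratings zero
    if score <= 1.5:
        return "good"
    if score <= 3.0:
        return "questionable"
    return "guarded"


if __name__ == "__main__":
    # Subgroup means reported in the text.
    for label, mean in [("prenatal", 0.79), ("postnatal", 1.27), ("perinatal", 3.14)]:
        print(label, prognosis_category(mean))
```

Applied to the subgroup means given above, the prenatal and postnatal means fall in the good range and the perinatal mean in the guarded range.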


3-Year Measures 


Again, these procedures will only be summarized
here. The following is excerpted from the
previously published reports (Ernhart, Graham,
Eichman, Marshall, & Thurston, 1963; Graham
et al., 1962; Graham, Ernhart, & Berman, 1963).

Only measures used for comparison with the
7-year study are included.
Psychological Measures


Cognitive Tests. This category included three
measures. The 1937 Stanford-Binet Intelligence
Scale, Form L, was used to assess intelligence and 
was supplemented by the Cattell Infant Intelli- 
gence Scale where necessary. A vocabulary mental 
age was derived from the Picture Vocabulary and 
Definition tests of Forms L and M, and the Word 
Vocabulary of Form L of the 1937 Stanford-Binet. 
The third measure of cognitive functioning was 
the Concepts test which involved several form- 
board problems using matching and sorting of 
blocks according to increasing levels of complex- 


ity. 

Perceptual-Motor Tests. This group included 
two measures. The first was the Copy-Forms test 
(Graham, Berman, & Ernhart, 1960) which re- 
quired that S copy the first 5 figures of the 18- 
figure test. The drawings were scored according to 
several form criteria. The second test was a
Perceptual-Motor battery which included four subtests:
a Figure-Ground Identification subtest, a
Peripheral-Distraction subtest, a Tactual Locali- 
zation subtest, and a Mark-the-Cars subtest simi- 
lar to many tests of cancellation. 

Personality Measures. Both psychological ex- 
aminer ratings and parent questionnaire ratings 
were obtained. Examiner ratings of four brain- 
injured traits including hyperactivity, demanding- 
ness, impulsivity, and distractibility were obtained. 
Three signs of maladjustment were also rated: 
infantilism, negativism, and fearfulness. An over- 
all brain-injured composite score and a malad- 
justment composite score were derived from these 
two sets of ratings. 

The Parent Questionnaire was composed of 209 
items in 14 subscales. Two major scores were de- 
rived from these: a Brain-Injury score derived 
from 6 subscales and a Maladjustment score de- 
rived from 7 subscales. 


7-Year Measures 


In selecting the procedures for the 7-year ex- 
amination, several factors were considered. First, 
it was necessary to adequately sample the areas of 
functioning outlined above. Practical considera- 
tions made it desirable to limit the assessment to 
one visit to the hospital. The procedures insti- 
tuted required 5-6 hours including a lunch period. 
While the child was being evaluated, another mem- 
ber of the research team interviewed the child's 
mother and administered several questionnaires. 
The examination was carried out without knowl- 
edge of the newborn status. 


Cognitive and Perceptual Functioning 


Cognitive Tests. The two tests included in this 
category were the Wechsler Intelligence Scale for 
Children (WISC; Wechsler, 1949) and the Gilmore 
Oral Reading Test, Form A (Gilmore, 1951). All 
of the WISC subtests except Mazes were admin- 
istered. 


Three scores were obtained from the Gilmore 
inclination to manipulate The
8. Frustration Tolerance: (a) low—S was resistant
to change, became disorganized with frustration,
was excessively hostile when blocked;
(b) high—S ignored frustration, was unresponsive
and unchanged by demands that blocked his flow
of behavior; (c) normal—S accepted frustrations,
but showed a desire to overcome or detour around
them; aroused anxiety helped rather than hindered
his solution to the frustration.

9. Social Sensitivity: (a) oversensitive—S was
excessively polite, conscientious or approval seeking
in the social aspects of the interview situation,
cared too much about his immediate impression or
too little about the actual situation; (b) obtuse—
S appeared dull, careless of the social demands,
ignored his appearance or effect on the examiners;
(c) normal—S was aware of the social aspects
(e.g., his appearance, words, impression given to
the examiners), but was free to act in a largely
natural, unstilted


responses as the interview situa- 
tion developed; (b) hyperflexible—S was easily 
distracted from previously learned useful re- 
sponses, lost the total trend of behavior in trying 
to meet each tiny demand; (c) normal—S molded 
his previously learned ways of dealing with adults 
to the peculiar situation, but did not overly ad- 
just. 

11. Communication: (a) low signal—S used too
few words and gestures, communicated little information
or feeling, showed paucity of creativity;
(b) high noise—S used words and gestures excessively
and inappropriately, added movements
to his play or sound effects to his words that detracted
from his meaning; (c) normal—S used
smiles, frowns, gestures, words that gave a clear
impression of the ideas and feelings involved as
being appropriate to his intelligence and culture.

12. Overall Rating on Organicity. This last scale
was also a 5-point scale varying from “No evidence
for organicity” to “Gross organicity.” The
rating of this scale was based on the following
three factors: (a) the total clinical impression
derived by the trained observer from the child’s
demeanor, responses, thought processes, and psychological
defenses; (b) a review of the first 11
scales in which a piecemeal determination was
made of the child’s deviation from the normal;
and (c) a correction was made for emotional factors
which might cause some clinical and scale
deviation, but which would not necessarily reflect
an overall organic pattern—e.g., a compulsive
neurotic girl would appear similar to a mild organically
damaged child defending himself against
his catastrophic fantasies by compulsive means.

Since most psychiatrists feel that family history
material is of importance in making clinical evaluations,
a psychiatric questionnaire designed for
this purpose was administered to the parents. This
questionnaire elicited information in six areas of
the child's functioning: coordination; sensation;
social life; emotional life; speech; attention, concentration,
and memory. After the initial ratings
of the child were completed, the psychiatrists had
access to these questionnaires. The rating scales
were filled out again with the benefit of this additional
information. This procedure yielded two
sets of ratings for each psychiatric examiner on
each case.

For purposes of analysis, Ss in each treatment 
group were divided into Socioeconomic Status X 
Sex subgroups. An attempt was made to select 
from each of these subgroups an equal number 
of Ss rated by each of the two senior psychiatrists.
This was done randomly in so far as was possible. 
This procedure was used to equalize the effects 
of any Psychiatric Rater X Subgroup interaction 
that might exist. The final tally was 94 cases for 
the first rater, 101 for the second, and 14 for the 
psychiatric Fellow. 


Neurological Examination 


This evaluation was essentially the same as that 
given in the 3-year study (Graham et al., 1962; 
Thurston et al., 1960). Much of the following is 
excerpted from these reports. 

The examination was made by a pediatrician on
the staff of St. Louis Children’s Hospital. This
pediatrician had also conducted the 3-year examinations.
However, the intervening years, number
of cases, and growth of the children all dictated
against his remembering any single case. A few Ss
were seen by pediatric residents who had been
trained by the project pediatrician. The procedure
included a brief, standard neurological examination
of the cranial nerves, reflexes, muscle tone,
gait and postural performance, as well as observation
of general reactions, speech, etc. Height,
weight, head and chest circumference were also
obtained. The notes from the examination were
recorded on a standard data form designed by
the previous investigators. The examiner, as was
the case with all personnel involved in the project
evaluations, had no knowledge of the newborn
classification and refrained from taking a history
of that period.

The results were classified according to the same
system as in the previous study (Thurston et al.,
1960). Four general categories were used: normal
—those Ss for whom there were no recorded deviations;
essentially normal—included deviations
which may be indicative of minimal CNS injury,
but which would usually be considered as deviations
falling at the extremes of the normal range;
suggestive—included deviations which suggest that
some kind of CNS damage is present; positive—
included Ss who presented definite neurological
signs indicative of damage to CNS. The Ss were
categorized by a pediatric neurologist in terms of
their most severe signs although others may have
been present. Table 3 lists the classification


TABLE 3
CLASSIFICATION OF NEUROLOGICAL DATA


o findings present 


ly normal—one or more of the following 
tions: 


psychological responses observed 
examination, as excessive fear, nega- 
n, distractibility 
turbance 


of febrile convulsions or other convul- 


ор or heterotropias 
assymetry 
dination of extremities 


lexi 


і апу one of the following, whether 

Or not accompanied by the above Essentially 

] deviations: 

tric hypotonia of one side or one 
ty (3 enses) 

ic hyperreflexia (3 cases) 

iive facial weakness (1 case) 

Ogic gait (1 case) 

y incoordination (2 cases) 

ed speech development (1 case) 

ital motor weakness (1 case) 

ble cranial nerve deafness (1 case) 


пу one of the following, whether or 
accompanied by the above Essentially 
nal deviations: 


al hyperreflexia and probable aphasia (1 


(1 case) 

ed hyperreflexia (2 cases) 

led hyperreflexia and possible hemipare- 

case) 

led hyperreflexia and cranial nerve palsy 
) 


ed hyperreflexia, lateral ankle clonus 
left facial weakness (1 case) 
ite facial weakness (1 case) 


of the neurological data in terms of these categories.
Each S is represented only once.


Data Preparation 


Statistical Control of Irrelevant Variables


The variables considered for their possible
effects were age, sex, race, private-clinic status,
and school grade. School grade was considered a
variable only for the reading test. The
Weights for grade (by semester) on the 


* There were only 6 of 80 Negroes under private
care. Consequently, race and status were
correction,
sumed that this bias in 
against the major hypotheses with which the study 
was concerned. The: 


X three-way (white private, white clinic, and
Negro) classification of the data. This permitted 
the dependen 


TABLE 4

                     Normal %   Anoxic %
Male
  White, private       20.1       31.7
  White, clinic        12.7       11.9
  Negro                14.2       17.8
  Total                47.0       61.4
Female
  White, private       17.9       13.9
  White, clinic        14.9        8.9
  Negro                20.1       15.8
  Total                52.9       38.6


TABLE 5
VARIABLES FOR WHICH MEASURES WERE ADJUSTED
Cognitive 
wise 
Verbal IQ Status 
Performance IQ Status 
Full Beale IQ Statue 
All eubtests except Cod 
Status 
Sex 
Gilmore Oral Test 
All scores — Grade, status 
Perceptual Motor teat Status 
Embedded Figures Status 
Personality 
Parent ratings " 
Vineland SQ Status . 
Teacher ratings 
Distractibility = 
Rigidity 
Anthropometric 
Head cireumference Sex 
Chest circumference Sex, age 
Height Status, sex 
Weight Age 
instances, the reasons for m.
fortuitous, The family might only allow 
examination ; some members of the голеней 
might be suddenly called away for mom 
eases. The reading test results were based 

uced № since а number of f 
. Any special considers: 
are considered with the 
Cognitive and Perceptual Functioning


Effects of Anoxia


The results from the cognitive and perceptual
tests are presented in Table 6.
There are several findings worthy of attention.
The significant difference in IQ which
was found in the 3-year study is no longer
evident although the same trend is present
in the data. An inspection of WISC Full
Scale IQ distributions, which are presented
in Figure 1, also suggests the same trend as
was present in the 3-year data (cf. Graham
et al., 1962, pp. 24-25). The overall anoxic
distribution is slightly displaced toward
the lower IQ range. An interesting difference
in the 7-year and 3-year distributions
is that the adjustments made on the 7-year
scores did not produce distributions that
were as markedly peaked as was the case
with the 3-year scores.

Since it had been predicted that the differences
in IQ would be attenuated at age
7, an attempt was made to test for a significant
change in IQ favoring the anoxic
group. The 3-year Binet IQs were transformed
to Wechsler IQ equivalents and
subtracted from WISC IQs. The results are
TABLE 6
DIFFERENCES IN COGNITIVE AND PERCEPTUAL MEASURES BETWEEN THE TOTAL ANOXIC
AND THE NORMAL GROUP
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юэ» из 100,77 в н 
1.6 16.0 юа ца “ 
104.57 ищ ш.и њо ia 
10.61 10 10.20 10 LI 
10.8 зз эм зю Za 
10.21 247 LR 2% тг 
1.00 12 10.75 30 Е 
11.05 3.05 9.75 1m io 
10.46 28 on зм 1.72 
пи 29 . 010 зш 15 
Arrangement п.м ERI] 10.48 3.13 IE J 
i 10.27 э 9.08 2.94 1.00 
11.42 3.2% 1.13 im Ld 
9.08 2.8 10.0 30 E 1 
wear IQ 3.13 зи и na 0 
Oral Reading Test* 
Accurac 12.67 6.72 юм on 1.91 
Comprehension 16.97 5.17 16.05 6.10 1.07 
— Reading Rate 61.00 25.47 SS вц „э 
ieeptual-Motor 27.70 5.70 ъз тш 2.4 
Wedded Figures» 9.00 3.55 8.46 3.4 1.17 
er Attention 18.83 5.60 17.41 6.5 1.77 
For the anoxic group N was 98 for this measure.
Ns for this test were 103 and 70, respectively.
p < .01.
in Table 6. There was no significant
difference in IQ increase between the groups
although the change is in the predicted
direction. Both groups demonstrated significant
(p < .01) increases in IQ from 3
to 7 years of age.

The only measure from WISC which
demonstrates a significant difference is the
Vocabulary subtest. This result is consistent
with the 3-year vocabulary results. We
suspect that the Vocabulary subtest
may be the best measure of abstract ability
in WISC at age 7. When the scoring criteria
of some of the other verbal subtests
such as Similarities and Comprehension
are compared with those for Vocabulary,
two factors seem to be involved. First, the
score range for average performance of a
7-year-old is less restricted on the
Vocabulary subtest than on the others.
Average performance on Similarities
and Comprehension entails little more


The results from the reading test scores 
are all in the predicted direction although 
not significantly different in 


tive data. Many of the children could not 
read at all. An attempt was made to deter- 
The difference, tested by chi square,
Perceptual-Motor test. In terms of standard
score units, the anoxic deficit on this
measure is the same as that for the Vo- 
cabulary subtest (—.42). This result is 
different from that of the 3-year study in 
which no perceptual-motor deficit was 
found. This finding would suggest that the 
reason for the lack of such a difference at 
3 years of age was a function of the ability 
not having developed sufficiently to show 
а deficit. That there was no difference on 
the Embedded-Figures test is surprising in 
view of the results of other researchers on 
the effects of brain damage and the rela- 
tive sensitivity of Embedded-Figures meas- 
ures to detect such deficits (Cobrinik, 
1959; Teuber & Weinstein, 1956). Perhaps, 
the fact that the anoxic group can be con- 
sidered at best minimally brain damaged, 
as well as the relatively young age of Ss, 


l perceptual 
are considered leave the issue undecided. 
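
A deficit "in terms of standard score units" can be illustrated as a mean difference divided by a standard deviation. The report does not state which standard deviation was used, so the sketch below assumes the normal group's; the function name is hypothetical. The Vocabulary values are those that remain legible in the Table 6 row for that subtest (normal M = 11.05, SD = 3.05; anoxic M = 9.75), and with these rounded figures the result lands near the reported -.42 without reproducing it exactly.

```python
def standardized_deficit(group_mean, reference_mean, reference_sd):
    """Mean difference expressed in units of the reference group's SD.

    Assumption: the reference (normal) group's SD is the scaling unit;
    the original report does not specify the denominator.
    """
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (group_mean - reference_mean) / reference_sd


if __name__ == "__main__":
    # Vocabulary subtest values as read from Table 6.
    deficit = standardized_deficit(group_mean=9.75,
                                   reference_mean=11.05,
                                   reference_sd=3.05)
    print(round(deficit, 3))
```

A negative value indicates that the anoxic mean lies below the normal mean.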
There is one other finding of considerable
interest evident in Table 6. If all of the
measures exclusive of IQ scores are considered
in terms of the number of times the
anoxic group has a lower mean score than
the normal group, we find that the anoxic
group score is lower on 15 out of 17
comparisons. There is also a tendency for
their standard deviations to be greater
than those of the normals (13 out of 1 


significant differences were obtained on
only two of the cognitive and perceptual
measures.
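
One way to gauge how unusual 15 lower means out of 17 comparisons would be by chance is a one-tailed sign test. The authors do not report such a test, so the following Python sketch is purely illustrative. It assumes that the 17 comparisons are independent and that each is equally likely to favor either group under the null hypothesis; the first assumption is doubtful here, since the measures come from the same Ss and are intercorrelated.

```python
from math import comb


def sign_test_upper_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    favorable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favorable / 2 ** trials


if __name__ == "__main__":
    # 15 of 17 comparisons in which the anoxic mean was the lower one.
    print(sign_test_upper_tail(15, 17))
```

Because of the independence assumption, the resulting probability should be read as a rough description of the direction of the differences rather than as a formal significance level.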
Type and Severity of Anoxia

Although the anoxic group may be divided
into subgroups in terms of type of
anoxia as well as in terms of severity of
anoxia on the basis of the prognostic
scores, the two methods are not independent.
The postnatal anoxics obtained a
higher mean Prognostic score than did the
prenatal group, and the perinatal group
obtained the highest mean Prognostic score
of all. However, the classification of results
both ways has merit. Table 7 presents
the cognitive and perceptual results according
to type of anoxia. Each subgroup
mean was tested against the appropriate
normal mean by t test.
The Vocabulary performance was significantly
poorer in the postnatal and
perinatal groups while the prenatal group
does not differ from the normal group. The
Perceptual-Motor test differed significantly
only for the postnatal subgroup.
The significant differences which occurred
might best be considered in relation to
the results according to severity
of anoxia. Table 8 presents these measures
in terms of the Prognostic scores.
Each subgroup mean was tested against the
appropriate normal group mean by t
test. On the classification according
to severity of anoxia, the Vocabulary and
Perceptual-Motor test scores differ for the
good and guarded groups but not
for the uncertain group. Just why this nonlinear
relationship should occur is not clear. The
significant differences on Picture Arrangement
and Coding in Table 7 and on Infor-
TABLE 8
7-YEAR PERFORMANCE ON THE COGNITIVE TASKS AS A FUNCTION OF NEWBORN
PROGNOSTIC SCORE

                                 Anoxic prognostic score
                            .25-1.5         1.75-4.0
                             Good           Uncertain        Guarded
                           (N = 61)         (N = 22)         (N = 16)
Test                       M       SD       M       SD       M       SD
Cognitive
 WISC
  Verbal IQ              102.06   15.16    98.93   12.80    98.38   22.41
  Performance IQ         102.17   15.60   101.57   16.71   104.77a  16.09
  Full Scale IQ          102.67   14.86    90.87   12.13    99.03   22.01
  Information             10.84    2.89     9.39*   2.19     9.40    3.92
  Comprehension            9.99    3.32     9.40    2.94    10.13    4.
  Arithmetic              10.25    2.93     9.34    3.06     9.67    2.87
  Similarities            10.81    3.83    10.95    3.88    10.25    4.39
  Vocabulary               9.92*   3.31     9.86    2.70     9.04*   4.19
  Digit Span               9.83    2.82     9.50    2.50    10.14    3.45
  Picture Completion      11.12*   3.04    11.47    2.30    10.53a   4.05
  Picture Arrangement     10.50    3.25     9.99    2.72    10.75a   3.33
  Block Design             9.84    2.68     9.54    2.86     9.30a   3.99
  Object Assembly         11.07    3.48    11.34    3.18    11.11a   3.48
  Coding                  10.38    2.91     9.79    3.28     9.34a   3.19
 Gilmore Oral Reading Test b
  Accuracy                 9.42*   8.11    11.13    9.88    12.50   12.50
  Comprehension           16.24    5.17    16.29    6.87    15.08    8.25
  Reading Rate            65.68   26.48    47.92*  28.20    49.67   28.66
Perceptual
  Perceptual-Motor        25.37*   7.28    26.83    8.12    23.41**  8.07
  Embedded Figures         8.77    3.44     7.74    3.06     8.27    3.72
  Perceptual Attention    18.43    6.17    16.00*   7.20    15.50*   6.40

a N = 15 for this M.
b Ns for these groups were 42, 16, and 12, respectively.
* p < .05.
** p < .01.

correlations of the three scores with school 
grade were as follows: .50 with Accuracy, 
.33 with Comprehension, and .50 with 
Reading Rate. There appeared, however, to 
be no better way in which to handle these 
data. In general, the overall trend in the
reading test data suggests that anoxics
tend to show some deficit in this area.

It is interesting to note that the test of 
Perceptual Attention emerges as a signifi- 
cant measure for the perinatal anoxia 
subgroup and for the uncertain and 
guarded prognostic groups. These results 
would suggest that this test may only be 
sensitive to greater deficits and will not 
detect minimal differences. However, the 
differences between deficits on this test and 
those on the Perceptual-Motor test as 
found in Tables 7 and 8 might indicate 
that these measures are tapping somewhat 


different functions. The former measure
appears to be somewhat directly related to
severity of anoxia, while the latter is not.
This problem will be considered more directly
in a later section.


Personality Ratings 


Psychological Examiner Ratings 


The first four behavior ratings produced
J-curve score distributions, and differences
were tested with the Mann-Whitney U test
(Siegel, 1956). The last four categorical
ratings were analyzed by chi-square tests.
It was necessary to collapse adjacent
categories to provide a sufficient number of
cases within each category. Anxiety and
activity were divided into two categories
representing the first and last three categories,
respectively, of the original scales.


TABLE 9
PSYCHOLOGICAL EXAMINER RATINGS
= ` eas 

Norma! Aww 

N = u0) We Mean- 

( м” ©. Wiitsey U 
2.00 3.00 5307 .0* 
2.00 100 5447.6 
2.00 3.00 5560.0 
2.00 1.00 5041.5 

Normal Anoajc $ x 
63.8 67.7 

36.2 32 96 

ty-Im- 

30.0 37.4 

40.2 43.4 

23.8 19.2 1.57 
63.8 74.7 

36.2 25.3 3.10 
18.5 18.2 

26.9 25.3 

54.6 56.6 10 


The other two bipolar rating scales were
reduced to three categories each. The results
are presented in Table 9.
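
For readers unfamiliar with the Mann-Whitney U statistic used for the J-curve ratings, the following Python sketch computes U from two sets of ordinal ratings using midranks for ties. It is a generic illustration of the statistic described in Siegel (1956), not the authors' computation, and the ratings in the usage lines are made-up values rather than data from the study.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y); U_x counts pairs with x above y, ties counted as 1/2."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2.0
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


if __name__ == "__main__":
    # Hypothetical 5-point ratings for two small groups.
    normal_ratings = [1, 2, 2, 3, 2]
    anoxic_ratings = [2, 3, 3, 4, 2]
    print(mann_whitney_u(normal_ratings, anoxic_ratings))
```

With samples as large as those in this study, U would ordinarily be referred to its normal approximation, with a correction for the many ties that 5-point ratings produce.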
It may be readily seen that of the behavior
rating scales only impulsivity and
distractibility were significantly different
for the two groups. While the differences for
dependency and rigidity were in the predicted
direction, they were not statistically
significant. None of the four categorical
ratings discriminated between the groups.
In general, the differences are negligible.
The analyses of the four categorical rating


TABLE 10
MEDIAN 7-YEAR EXAMINER RATINGS AS A FUNCTION OF
TYPE OF ANOXIA


Anoxic subgroup Md 


Perinatal 

A) WE ms) 
3.00 4.00** 2.00 
2.00 3.00* 2.00 
2.00 3.00 2.00 
1.00 3.00 2.00 
TABLE 11
MEDIAN 7-YEAR EXAMINER RATINGS AS A FUNCTION
OF NEWBORN PROGNOSTIC SCORE

—— س‎ 
_ чеч son 
Rating zu ditis [ZU 
(<=) (9. ш) Wem 
Impulsivity 3.00* 2.00 3.00 
Distractibility 3.00" 9 50** 
Dependency 3.00 2.00 2.00 
Rigidity 2.00 2.00 2.00 
* p < .05.
** p < .01.


are somewhat puzzling. While the subgroup
with the best prognosis was significantly
more distractible than the normals,
the groups with a poorer prognosis were
significantly less distractible than the normals.
Whatever the explanation of these
results, the overall data give little support
to the usually hypothesized clinical brain-
damage syndrome.


Parent and Teacher Ratings 


The results from the parent rating meas- 
ures are presented in Table 12. The anoxics 
achieved a significantly lower mean Vine- 
land SQ than did the normal group. An 
inspection of the distributions for this 
measure, which are presented in Figure 2, 
reveals a downward displacement of the 
entire anoxic group. This result is consistent 
with that of Graham, Ernhart, Thurston, 
and Craft (1962) for their intelligence test 
data. There is no trend in the data which 
would suggest a second mode for the anoxic 
group or, by implication, an “all-or-none” 
effect for the consequences of anoxia. 

Since the Vineland SQ proved to be the 
most discriminating measure in the project, 
an attempt was made to evaluate the 
TABLE 12
7-YEAR PARENT RATINGS


= eee = 
Normal Asas 
Rating ee (CX p w =n) П 
м зр м s0 
Vineland SQ 15.38 13,10 106.92* 13.51 3.80% 
Activity 14.90 4.43 15.81 5.15 1.44 
Distractibility 11.50 3.50 11,45 3.79 .91 
Rigidity 8.15 2.43 7.95 2.53 ‚61 
Impulsivity 15.50 4.07 15.74 5.08 23 
мє = B e m — 
Normal % Anoxic % x 
Adjustment* 
Maladjusted 13.1 23.2 4.02* 
Motor eoordination* 
Poor 88.5 88.0 01 


* N = 96 for this M.
* Ns for this measure were 130 and 99, respectively.
* p < .05.
** p < .001.
groups on each of the Vineland subcate- 
gories. This procedure might clarify the 
basis for the difference in the groups. The 
subcategory measures produced very re- 
stricted or nonnormal distributions. Con- 
sequently, it was not possible to correct the 
subscores for sex-status differences, and 
the tests were made with the U test. These 
results are given in Table 13. 
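
The chi-square value reported in Table 12 for the Adjustment rating can be approximately reproduced from the percentages and Ns given there. The Python sketch below is a reconstruction under stated assumptions: the cell counts are inferred by rounding the reported percentages (13.1% of 130 and 23.2% of 99), no continuity correction is applied, and the function names are hypothetical. The report does not state whether a correction was used.

```python
def count_from_percent(percent, n):
    """Nearest whole number of cases implied by a reported percentage."""
    return round(percent * n / 100.0)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]], using the shortcut N(ad - bc)^2 / product of margins."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column must contain at least one case")
    return n * (a * d - b * c) ** 2 / margins


if __name__ == "__main__":
    normal_n, anoxic_n = 130, 99
    normal_mal = count_from_percent(13.1, normal_n)   # inferred count
    anoxic_mal = count_from_percent(23.2, anoxic_n)   # inferred count
    chi2 = chi_square_2x2(normal_mal, normal_n - normal_mal,
                          anoxic_mal, anoxic_n - anoxic_mal)
    print(normal_mal, anoxic_mal, round(chi2, 2))
```

That the uncorrected statistic agrees with the tabled value to two decimals is consistent with, but does not confirm, the assumption that no continuity correction was applied.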

The anoxics obtained significantly lower 
scores on Self-Help (Eating and Dressing), 




Occupation, and Socialization. Considera- 
tion of these categories in relation to the 
age range (6-11 years) in which Ss failed 
items reveals that there are more items 
contributing to these categories than is 
true of the nonsignificant subscales. The 
only exception is Communication which 
has the most items in this range. Hence, 
most of the significant subcategory results 
may be attributable to differential item 
frequencies rather than to specific be- 


FIG. 2. Distributions of Vineland Social Maturity Scale SQ for the normal (N = 134) and anoxic (N = 96) groups. Horizontal axis: social quotient; vertical axis: percentage frequency.
13.2 13 13.1 13
n.2 12 10,7 10** 
11.5 12 10.9 ne 
7.0 7 6.9 7 
1.2 1 1.1 1 
10.6 n 10.0 )o** 
7.5 7 7.4 7 
5.2 8 7.9 s" 


* p < .01.
** p < .001.


deficits. In general, these results
suggest that the anoxics appear to show
more dependent behaviors than normals.

The only other parent measure which
was significant was the Adjustment rating.
A significantly greater number of anoxics
were rated as maladjusted. The full analysis
of parent ratings according to type and
severity of anoxia will not be given since it
Vineland SQ was consistently and
according to both modes of classification.
Only the postnatal anoxic group (28.2%)
and the uncertain prognostic group (34.8%)
differed significantly from the normals in
terms of maladjustment.

The teacher ratings are presented in Table 14.
None of the differences was significant.
It is interesting to note that differences
on the first four ratings which were subjected
to analysis on the control variables
and adjusted where necessary were not
even in the predicted direction. The same
any consistent trend
tion. On subsequent analysis in relation to

TABLE 14 
7-Year TEACHER RATINGS 
Normal Anoric 

N* 

E M so M SD 
120,93 12.77 5.50 11.61 4.74 
120,93 11.48 4.96 10.96 4.78 
120,93 6.78 3.26 6.43 2.92 
120,93 10.60 5.00 9.73 4.19 

Normal % Anoxic % 
ne 34.2 31.9 
43.6 41.5 
22.2 26.6 
aa 32.8 28.0 
Hb 17.0 26.4 


are listed in order for the normal group and the anoxic group. 
type and severity of anoxia, a significantly
greater number of the postnatal subgroup 
(34.2%) and the subgroup with a guarded 
prognosis (46.2%) were rated as poor in 
Motor Coordination. 


Independence of the Ratings 


The intercorrelations of the examiner 
ratings are presented in Table 15. It is not 
too surprising that the two ratings on which 
significant differences occurred also gave 
the highest intercorrelation. It would seem 
reasonable that a child who is impulsive 
is also likely to be distractible. Another 
interesting finding is that most of the cor- 
relations are positive. This would suggest 
that, typically, an examiner who saw one 
kind of deviant behavior in an S would be 
likely to see other deviant traits in the same 
S whether based on specific test behaviors 
or global impressions. 

The intercorrelations of the parent and 
teacher ratings are given in Table 16. All 
of the correlations are significant (p <
.05). It is interesting to note that, within
each of these sets of ratings, consistency 
tends to be even higher than was the case 
with the examiner ratings. There is even a 
tendency for the teachers to show the 
greatest consistency of all. However, while 
the Vineland SQ is significantly related to 
other parent ratings, these correlations are 
much lower. This result would suggest that 
the different measures are tapping some- 
what different attributes of the Ss. While 


one might not expect a very high degree of
relationship between the parent rating 
scales and the Vineland SQ, some data to 
be presented below suggest that the rela- 
tionship ought to be higher. 

There were four traits rated in common 
by the various raters. The relationships 
among the raters on these traits are given 
in Table 17. The teacher and parent show a
higher degree of consistency than do either
of the other two combinations. However,
even those relationships which are significant
are too low to be very meaningful.

It is of interest to compare the Vineland 
SQ with the four Examiner Behavior Rating 
Scales. The correlations were: Impulsivity,
—.39; Distractibility, —.29; Dependency, 
—.30; and Rigidity, —.28. All of these 
correlations are significant (p < .01). The 
Vineland correlations with these examiner 
ratings are consistently, although not sig- 
nificantly, higher than the correlations 
between the Vineland and the other parent 
ratings. The only plausible explanation for 
these results would appear to be that the 
examiner behavior ratings and the Vineland 
Social Maturity Scale deal with much more 
specific behaviors than is the case with the 
other rating scales. Consequently, they are 
probably both more valid and more reliable 
measures of personality functioning. 

The results considered in this section 
would suggest that each of the raters had a
different frame of reference to use in evalu- 
ating Ss and, within his own frame of 


TABLE 15 
INTERCORRELATION OF EXAMINER RATINGS
(N = 232-234)

Rating 2 3 4 5 6 7 8
1. Impulsivity .60** .02** .95** .14* .40** .A0** .06 
2. Distractibility POT 09 23** — 30 —.14* 
3. Dependency .18* 15* .14* .20** .03 
4. Rigidity 14* 11 A8 = 
5. Anxiety Tii 13 —.18* 
6. Compulsivity-impulsivity .43** ;24** 
7. Activity .12 
8. Rigidity-lability 

* p < .05.
** p < .01.


It is instructive to note that, in line with 
the preceding comments, those measures 
which resulted in significant group differ- 
ences tended to be the ratings which dealt 
with very specific behaviors. The inconsistent
direction of results from many of
the teacher and parent rating scales may 
well have been a function of an internal 
frame of reference irrelevant to the intent 
of the rating task itself. However, when 
the behavioral items become sufficiently 
specific, they may overcome the internal 
frame of reference or halo effect. 

The fact that the Examiner Behavior 
Rating Scales correlated significantly with 
the Vineland SQ might also suggest that 
the examiner's general frame of reference 
was more “objective” than that of the par- 
ent. It may be again emphasized that the 
examiner had no knowledge of birth status 
of the children to influence his judgments. 
Thus, it would appear that items that ask 
whether or not a child does some specific 
thing as opposed to those which ask how 
frequently he engages in activities of a 
given type will lead to more valid results. 


TABLE 17
CORRELATIONS AMONG THE RATERS ON SIMILAR TRAITS

Trait              Examiner       Examiner       Teacher
                   and parent     and teacher    and parent
Activity           [illegible]    [illegible]    .17*
Distractibility    .08            .04            .25**
Rigidity           .09            .15*           .19**
Impulsivity        .10            .00            .21**

* p < .05.
** p < .01.
TABLE 16
INTERCORRELATIONS OF PARENT AND TEACHER RATINGS SEPARATELY 
Parent rating 3 " Xx 
Ratings (N = 229-252) Topa Prt 
2 3 4 5 1 "m3 pei 
1. Activity .62 .49 .67 —.2 .63 57 E 
2. Distractibility EL .56 —.22 .07 .56 
3. Rigidity 43 = 15 .58 
4. Impulsivity —.21 
5. Vineland SQ

reference, each rater was fairly consistent.


Psychiatric Ratings


The extreme categories on the 11 bipolar 
rating scales were used too infrequently 
for purposes of analysis. Consequently, the 
two extremes were combined with their 
adjacent categories resulting in three- 
category scales. The Organicity rating 
scale (Number 12) was collapsed to four 
categories by combining the fourth and 
fifth categories. All of the analyses were 
carried out with chi squares. The results
are presented in Table 18. 
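
The collapsing step described above can be sketched in a few lines of Python. The counts below are hypothetical (only percentages are reported for these scales), the function names are illustrative, and the ordinary Pearson chi-square is assumed rather than documented in the text.

```python
# Hypothetical counts for one five-category bipolar scale (not data from the study).
normal_5 = [3, 26, 64, 20, 6]
anoxic_5 = [8, 18, 40, 16, 7]


def collapse_extremes(counts):
    """Merge each extreme category with its neighbor: 5 categories -> 3."""
    return [counts[0] + counts[1], counts[2], counts[3] + counts[4]]


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


table_3 = [collapse_extremes(normal_5), collapse_extremes(anoxic_5)]
chi2, df = pearson_chi_square(table_3)
print(table_3, round(chi2, 2), df)
```

With two groups and three collapsed categories the statistic has 2 degrees of freedom. A scale collapsed to four categories, as was done for Organicity, would have 3.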

The use of the Parent Questionnaire 
tended to increase discrimination of the 
groups except on two scales—Anxiety and 
Communication. Significant differences be- 
tween the normal and anoxic groups occur 
on only 5 of the 12 scales. Significant 
differences occur on Emotional Control 
(anoxics more "explosive") and Social
Sensitivity only with the use of the Parent
Questionnaire. The Social Sensitivity rating,
with a greater number of anoxics categorized
as "obtuse," is consistent with the
results from the Vineland Social Maturity 
Scale. Much of the data for both measures 
came, of course, from the same source—the 


parent. 
The finding of a greater number of
anoxics rated as hyperflexible on the
Flexibility scale is consistent with the
examiner rating of Distractibility. Hyper- 
flexibility was defined in terms of dis- 
tractibility. The difference between the 
groups on Communication is an interesting 
one. The difference occurs in the high-noise 
category. This result would suggest that 
although some of the anoxics are clearly 


verbal and expressive, the expression is 
TABLE 18
7-YEAR PSYCHIATRIC RATINGS

                                         Without Parent Questionnaire    With Parent Questionnaire
                                         % in each category              % in each category
Rating scale                             Normal       Anoxic             Normal       Anoxic
                                         (N = 119)    (N illegible)      (N = 116)    (N illegible)

1. Depth of Relationship                 χ² = 1.96                       χ² = 3.94
   Deep                                    6.7          9.0                6.9         10.3
   Normal                                 68.9         50.6               69.8          5.3
   Shallow                                24.4         31.5               23.3         33.3
2. Rate of Development of Relationship   χ² = 5.30                       χ² = 4.45
   Fast                                   16.0         29.2               20.7         32.2
   Normal                                 57.1         47.2               56.0         42.5
   Slow                                   26.9         23.6               23.3         25.3
3. Activity                              χ² = .53                        χ² = 2.96
   Hyperactive                            29.4         32.6               42.2         51.7
   Normal                                 53.8         53.9               50.0         37.9
   Hypoactive                             16.8         13.5                7.8         10.3
4. Attention and Concentration           χ² = 3.24                       χ² = 3.95
   Distractible                           22.7         21.3               26.7         32.2
   Normal                                 54.6         44.9               51.7         37.9
   Perseverative                          22.7         33.7               21.6         29.9
5. Emotional Control                     χ² = .69                        χ² = 6.13*
   Explosive                              33.6         34.8               39.7         52.9
   Normal                                 44.5         39.3               40.5         24.1
   Compulsive                             21.8         25.8               19.8         23.0
6. Anxiety                               χ² = 5.05                       χ² = 2.86
   Lacking                                21.0         34.8               31.0         42.5
   Normal-focal                           47.1         37.1               40.5         33.3
   Diffuse                                31.9         28.1               28.4         24.1
7. Play                                  χ² = .44                        χ² = 3.50
   Concrete                               28.6         32.6               31.0         39.1
   Normal                                 39.5         36.0               36.2         24.1
   Fantastic                              31.9         31.5               32.8         36.8
8. Frustration Tolerance                 χ² = .14                        χ² = 1.51
   High                                    5.9          6.7                5.2          9.2
   Normal                                 63.0         60.7               55.2         49.4
   Low                                    31.1         32.6               39.7         41.4
9. Social Sensitivity                    χ² = 3.64                       χ² = 5.97*
   Obtuse                                 14.3         24.7               12.1         23.0
   Normal                                 61.3         53.9               61.2         46.0
   Oversensitive                          24.4         21.3               26.7         31.0
10. Flexibility                          χ² = 6.09*                      χ² = 6.93*
   Hyperflexible                           6.7         15.7                6.0         16.1
   Normal                                 58.8         44.9               50.0         36.8
   Rigid                                  34.5         39.3               44.0         47.1
11. Communication                        χ² = 11.04**                    χ² = 7.69*
   High noise                              4.2         18.0                6.0         18.4
   Normal                                 69.7         56.2               64.7         54.0
   Low signal                             26.1         25.8               29.3         27.6
12. Organicity                           χ² = 8.42*                      χ² = 13.68**
   Moderate to severe                      4.2         10.2                4.3         11.6
   Mild                                   11.9         21.6               14.8         29.1
   Suggestive                             23.7         25.0               26.1         26.7
   No evidence                            60.2         43.2               54.8         32.6

* p < .05.
** p < .01.
[...]-interfering and reduces the effectiveness
of the communication process.

The results of the overall rating on
Organicity suggest that clinicians are quite
capable of detecting behavioral cues which
are considered to be indicative of neurological
deficit even when all of these cues
are not clearly specified. It may be noted,
however, that only 60.2% (54.8% with the
use of the Parent Questionnaire) of the
normals were categorized as showing no
evidence of organicity. This finding would
suggest that the raters were somewhat
"[over]sensitive" to possible signs of
organicity.

The analysis of the data in terms of type
of anoxia added little to the results already
presented and will not be given here. It is
sufficient to indicate that the obtained
differences came primarily from the deficits
in the postnatal and perinatal anoxic
groups. An analysis of the data by
prognostic classification was not carried out
[...]


TABLE 19
PERCENTAGE OF INTERRATER AGREEMENT ON THE PSYCHIATRIC RATING SCALES
(N = 82-90)

            Without parent data        With parent data
            Number of categories       Number of categories
Rating      5             3ᵃ           5             3ᵃ
1           05            68           [illegible]   02
2           [illegible]   64           5             61
3           61            65           45            [illegible]
4           [illegible]   49           [illegible]   [illegible]
5           42            50           4             52
6           [illegible]   [illegible]  2             29
7           33            43           3             43
8           [illegible]   50           47            55
9           47            [illegible]  40            42
10          52            5            48            [illegible]
11          58            [illegible]  55            [illegible]
12          49            51ᵇ          49            [illegible]ᵇ

ᵃ For the analysis, the two extreme categories were combined with the adjacent categories.
ᵇ Based on four categories.


[...] 90 Ss were rated by both senior
psychiatrists, it is possible to consider the
[...] of interrater agreement for the
various scales. These data are presented in
Table 19. First, it may be noted that the
percentage of agreement is not very high.
There is also a tendency for agreement to
[...] with the use of the Parent Questionnaire.
This result would suggest somewhat
differential usage of this information
by the raters. It should be noted that,
although the rating scales were developed
[...] by the two major raters, no attempts
were made to insure consistency of
usage. These raters had somewhat different
theoretical viewpoints, and consistency
[...] not necessarily be related to validity.
The three-category ratings resulting from
the combination of extreme categories,
which was done for purposes of analysis,
produced somewhat greater interrater agreement
but not as much as would have been
expected if disagreements occurred primarily
on the extent of a trait deviation.
This result indicates that most of the disagreement
between raters occurred in the
[...] the middle normal category and
either extreme. When agreement was calculated
on all five categories allowing one
category of disparity in judgment, the
agreement rose to a range of 69-97% with
a median of 85%.
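
The three ways of scoring agreement discussed above (exact agreement on five categories, exact agreement after collapsing to three, and agreement within one category) can be illustrated as follows. This is a minimal sketch: the ratings are hypothetical, the function names are illustrative, and the coding of categories as the integers 1 to 5 is an assumption.

```python
def percent_agreement(rater_a, rater_b, tolerance=0):
    """Percentage of Ss whose two ratings differ by at most `tolerance` categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    hits = sum(1 for a, b in zip(rater_a, rater_b) if abs(a - b) <= tolerance)
    return 100.0 * hits / len(rater_a)


def collapse(rating):
    """Map a 1-5 rating onto three categories (1-2 -> 1, 3 -> 2, 4-5 -> 3)."""
    return {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}[rating]


# Hypothetical five-category ratings of ten Ss by two raters (not study data).
rater_a = [1, 2, 3, 3, 4, 5, 3, 2, 3, 4]
rater_b = [2, 2, 3, 4, 4, 4, 2, 3, 3, 5]

exact_5 = percent_agreement(rater_a, rater_b)
exact_3 = percent_agreement([collapse(r) for r in rater_a],
                            [collapse(r) for r in rater_b])
within_one = percent_agreement(rater_a, rater_b, tolerance=1)
print(exact_5, exact_3, within_one)
```

In this constructed example every disagreement that survives collapsing lies between the middle category and a neighboring one, which is the pattern the text describes.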

These results raise a point of considerable 
interest. No claim for a very high degree of 
reliability can be made for these rating 
scales. It may also be noted that neither 
those scales with the poorest interrater 
agreement nor those with the “best” agree- 
ment gave any significant differences be- 
tween the groups. 

A typical conclusion of researchers who 
use personality ratings is that it is very 
difficult to obtain reliable measures. It is 
further assumed that if more reliable 
measures were available, more significant 
results would be obtained. However, an 
alternative conclusion might just as easily 
be made: if consistent differences along 
some dimension occur in groups of Ss under 
study, even fairly unreliable measurement 
techniques will detect them. The clinical 
significance of our positive findings will be 
discussed in a later section. 


Neurological Findings 


The anthropometric measures and neuro- 
logical findings are presented in Table 20. 
TABLE 20
7-YEAR ANTHROPOMETRIC DATA AND NEUROLOGICAL FINDINGS

                               Normal            Anoxic
Measure        Nᵃ            M       SD        M       SD        t
Head           120, 95      51.96    1.79     51.71    1.52     1.10
Chest          119, 93      55.79    3.85     57.31    3.53     2.97**
Height         120, 95      48.49    2.15     49.63    3.48     2.97**
Weight         123, 93      53.64    9.25     55.13    9.23     1.17

                             Normal %          Anoxic %          χ²
Neurological
  Suggestive and
  positive     115, 95        7.8              12.6              .89

ᵃ Ns are listed in order for normal and anoxic groups.
** p < .01.

The suggestive and positive neurological 
categories were combined for analysis because
of the small number of Ss in each
category. The normal and essentially nor- 
mal categories were also combined. The 
results were analyzed by chi square. 

The units of measurement for the anthro- 
pometric measures were centimeters for 
head and chest, inches for height, and 
pounds for weight. The anoxics had larger
chest circumferences and were taller than
the normals. These tendencies had been
present in the 3-year measures but were
not significant, although weight had been
significantly different in the earlier study
(Graham et al., 1962). These differences 
are not readily interpretable. About all 
that may be said of them is that there is no 
indication of physical underdevelopment 
as a function of perinatal anoxia. A re- 
classification of the data in terms of type 
of anoxia or prognostic groups revealed no 


TABLE 21
POSITIVE AND SUGGESTIVE NEUROLOGICAL SIGNS AT 3 AND 7 YEARS OF AGE

Classification               Number of cases
3 years      7 years         Normals    Anoxics
P and S      Missing            2          1
P and S      P and S            0          7
P and S      N and EN           9         15
N and EN     P and S            8          3
Missing      P and S            1          2

Note.—Abbreviations are: P, positive; S, suggestive; N, normal; EN, essentially normal.


trends which would aid in interpreting 
these findings, and those data will not be 
presented here. 
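
As a rough check, the t values in Table 20 can be approximated from the reported means, SDs, and Ns. The sketch below assumes a pooled-variance two-sample t, which the text does not confirm; the function name is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean2 - mean1) / standard_error


# Chest circumference from Table 20: normal group first, then anoxic group.
t_chest = pooled_t(55.79, 3.85, 119, 57.31, 3.53, 93)
print(round(t_chest, 2))
```

Under these assumptions the result falls close to the tabled value of 2.97. Small discrepancies would be expected from rounding of the published means and SDs.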

The analysis of the neurological findings 
suggests a tendency for a larger proportion
of anoxics to show positive and suggestive
signs of neurological deficit than was true 
of the normals. However, this difference is 
not significant. The proportion of the nor- 
mal group with such signs is nearly identi- 
cal to that obtained in the 3-year study 
(7.2%, Graham et al., 1962). However, the 
proportion of such signs in the anoxic group 
dropped from 21.1% at 3 years of age to 
12.6% in the present study. 

A question may be raised as to the nature 
of the difference between the findings in the 
3-year study and the present results. Are 
these differences a function of the cases lost 
from the 7-year study? Did some members 
of the anoxic group show recovery or com- 
pensation? Are the present results unrelated 
to those of the previous follow-up study? 
A comparison of the 3-year and 7-year 
findings may provide an answer to these 
questions. Table 21 gives the classification 
in both follow-up studies of the Ss who 
were classified as showing positive or sug- 
gestive neurological signs in at least one of 
the studies. 

The data are easier to follow in terms of 
the actual number of cases rather than in 
terms of percentages. There were 11 nor- 
mals and 23 anoxics in the 3-year study 
classified as showing positive or suggestive 
signs. The corresponding number of cases
in the present study was 9 normals and 12 
anoxics. It can be seen from the data in 
Table 21 that three cases evaluated in 
each of the follow-up studies did not re- 
ceive a neurological examination in the 
other study. 

It may be noted that, of the normal Ss 
seen in both studies, there were no con- 
sistencies in the findings. Those who 
showed positive or suggestive signs at one 
age were classified as normal or essentially 
normal at the other age. Of the 22 anoxics
classified as positive or suggestive at 3 
years of age and seen in the present study, 
only 7 or approximately one third of the 
group were so classified at 7 years of age. 
The other 15 Ss in this group were classi- 
fied as normal or essentially normal in the 
present study. Only 3 anoxic Ss classified
as normal or essentially normal in the 
earlier study were classified as positive or 
suggestive in the present one. 

These findings would suggest that there 
is a somewhat greater tendency for anoxics
to show consistency in signs of neurological
deficit than is the case for normals. However,
the data do not necessarily support
the contention that recovery from earlier 
signs of impairment may occur. The data 
from the normal group suggest that most 
of the positive and suggestive signs of 
deficit encountered in these Ss are not 
sufficiently stable to warrant firm conclusions
as to their importance. No analyses of
neurological signs in terms of type or severity
of anoxia were carried out since
many of the cells had too few cases to warrant
a precise test.


Prediction of 7-Year Effects 


One of the major concerns of the present 
study was with the relationship between 
early functioning and behavior at 7 years 
of age. The newborn ratings and behavior
test scores and the 3-year measures were
available for comparison with the present
data.


Newborn Ratings and Behavior Tests 


Four ratings were available for correlations
with the 7-year measures: signs of
prenatal anoxia, signs of postnatal anoxia,
signs of CNS trauma, and the prognostic 
scores which combined all three. The cor- 
relations of these four ratings with the 7- 
year measures are presented in Table 22. 

The correlations with prenatal ratings 
are partial correlations with the effects of 
postnatal anoxia removed. Similarly, the 
correlations with postnatal ratings are 
partial correlations with the effects of pre- 
natal anoxia removed. The incidence of 
signs of CNS trauma was so infrequent 
that the correlations with this rating are 
for the anoxic group only. The prognostic
scores were the sum of the other three 
ratings. 
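
For readers unfamiliar with the procedure, a first-order partial correlation can be computed from three zero-order correlations. The sketch below uses the standard textbook formula, which is assumed here since the computational method is not described; the function name and the two correlations with the 7-year measure are hypothetical, and only the .40 between the prenatal and postnatal ratings is a value reported in this section.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = prenatal rating, y = a 7-year measure, z = postnatal rating.
# r_xy and r_yz are illustrative values, not results from the study.
r_partial = partial_correlation(r_xy=-0.10, r_xz=0.40, r_yz=-0.15)
print(round(r_partial, 3))
```

When the two newborn ratings are themselves correlated, removing one can shrink an already small zero-order correlation further, as in this constructed case.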

There is only one significant correlation 
with the postnatal rating and none with 
the prenatal rating. It may be noted that 
these ratings correlate higher with each 
other (.40) than with any 7-year measure. 
The significant partial correlation between 
postnatal rating and the Reading Rate 
score is consistent with the subgroup analysis
in which only the postnatal anoxics
differed significantly from the normals on 
the latter score. However, when the number 
of correlations is considered, very little 
may be attributed to only one which is 
significant. 
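
The caution about a single significant correlation among many can be made concrete. The sketch below assumes independent tests at the .05 level, a simplification since the 7-year measures are intercorrelated; the count of 50 tests is a hypothetical round figure, not taken from the tables, and the function names are illustrative.

```python
def expected_chance_hits(n_tests, alpha=0.05):
    """Expected number of nominally significant results if every null is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha=0.05):
    """Chance of one or more significant results, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Hypothetical count of correlations computed against a single newborn rating.
n_tests = 50
print(expected_chance_hits(n_tests), round(prob_at_least_one(n_tests), 3))
```

On these assumptions two or three significant coefficients per rating would be expected by chance alone, so an isolated significant value carries little weight.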

Most of the correlations for the anoxic 
group with the CNS rating are also not 
significant. It is of interest that both the 
parent ratings and the teacher ratings of 
motor coordination are significantly re- 
lated to the CNS measure. Both are in the 
predicted direction. The only other meas- 
ure which relates to the CNS rating is the 
Psychiatric Organicity rating.

There are a number of measures signifi- 
cantly correlated with the newborn Prog- 
nostic scores. However, these correlations 
have questionable meaning when their size 
is taken into consideration. While most of 
the significant relationships are consistent 
with those group analyses which were sig- 
nificant, one may only conclude that pre- 
diction of future deficits from the birth 
ratings is of questionable value. 

A series of correlational analyses was 
conducted on the relationship between the 
7-year measures and the newborn behavior 
TABLE 22
CORRELATIONS BETWEEN 7-YEAR MEASURES AND THE NEWBORN RATINGSᵃ
Measure    Prenatal rating    Postnatal rating    CNS rating    Prognostic score
Cognitive 
WISC 
Verbal IQ — .05 —.05 —.09 —.13* 
Performance IQ —.06 .06 —.03 .00 
Full Scale IQ —.04 —.03 —.12 —.12
Information — .06 — .02 —.17 —.14* 
Comprehension .00 —.03 .04 — .03 
Arithmetic — .02 —.07 —.05 —.09 
Similarities —.02 —.01 —.09 — .06 
Vocabulary —.07 —.07 —.11 —.18** 
Digit Span —.03 — .05 -00 — .09 
Picture Completion —.03 03 —.16 —.03 
Picture Arrangement —.04 —.02 —.02 —.06
Block Design —.02 —.05 —.15 —.12 
Object Assembly —.02 03 —.04 —.01 
Coding " —.10 09 —.18 —.04 
Gilmore Oral Reading Test 
Accuracy —.03 —.13 .12 —.03 
Comprehension .01 —.12 .00 —.10 
Reading Rate .05 —.22* —.19 —.18* 
Perceptual 
Perceptual-Motor —.01 —.11 —.17 —.17* 
Embedded Figures —.05 —.03 —.05 —.07 
Perceptual Attention —.05 —.06 —.11 —.12 
Personality 
Examiner ratings 
Impulsivity —.06 07 —.12 -00 
Distractibility —.01 —.01 — .02 .02 
Dependency —.02 .09 —.10 .08 
Rigidity -04 02 04 12 
Anxiety —.01 -03 3n .07 
Compulsivity-impulsivity .00 .02 .07 .04 
Activity —.08 -05 .04 -06 
Rigidity-lability —.01 .02 —.04 .00 
Parent ratings 
Vineland SQ —.05 = 11 .07 —.18** 
Activity .04 .01 —.05 .03 
Distractibility .04 —.06 .14 —.01 
Rigidity .06 — .06 .07 —.01 
Impulsivity .01 —.02 .07 .00 
Adjustment .01 —.01 —.07 —.14* 
Motor coordination 25107 .00 —.23* —.05 
Teacher ratings 
Activity —.08 —.06 Sti —.07 
Distractibility —.02 —.08 .07 —.02
Rigidity —.02 —.01 .08 —.01 
Impulsivity .00 —.04 .19 —.02 
Achievement -00 = 15 — .08 
Adjustment .01 —.18 —.15 =.04 
Motor Coordination .00 -00 — .26* —.14* 
Psychiatric ratingsᵇ
Organicity I .07 .10 .24* 1909" 
Organicity II .09 .13 .20 .93** 
Anthropometric 
Head — .06 —.01 ll —.05 
Chest .12 .02 .08 .10 
Height .05 .05 14 +16" 
Weight .09 .00 .03 .04 


ᵃ Prenatal and postnatal relationships are partial correlations. Correlations with CNS ratings are
for the anoxic group only (N = 50-100). All other correlations are for both groups combined (N = 173-[...]).
ᵇ The first rating is without the Parent Questionnaire and the second rating was made with the use of
the questionnaire.
* p < .05.
** p < .01.
measures. These correlations tended
to be even lower than those with the newborn
ratings. Only three correlations with
the mean newborn behavior score were
significant. One of these was with Vineland
SQ (.29) for the anoxic group only. The
very few correlations with the individual 
newborn behavior test scores which were 
significant were again too small to be of 
any interpretative value. Consequently, 
these analyses will not be given here. 

Again, it must be concluded that the newborn
measures are of questionable value in
predicting later deficits. One may do nearly 
as well with the categorical information as 
to whether or not a given individual was 
anoxic at birth. Many of the measures
which showed significant group effects were 
not consistently related to severity of 
anoxia on the subgroup analysis. 


3-Year Measures 


The correlations between the 3-year and 
7-year cognitive and perceptual measures 
are presented in Table 23. These results 
are of interest from several vantage points. 
The correlations among the various intel- 
lectual measures are consistent with pre- 
vious studies which have related such 
measures at these ages. The primary reason 
that only moderate correlations were ob- 
tained appears to relate to the lack of 
stability in such measures obtained in early 
childhood (cf. Anderson, 1939; Honzik, 
1938; Honzik, Macfarlane & Allen, 1948; 
Pinneau, 1961). In addition, both of our 
groups showed a significant increase in IQ.

The 3-year Concepts test correlated almost
as much with the 7-year perceptual
measures as with the intellectual measures. 
Conversely, the 3-year perceptual measures 


TABLE 23
CORRELATIONS BETWEEN 3-YEAR AND 7-YEAR COGNITIVE AND PERCEPTUAL MEASURES
IN THE COMBINED NORMAL AND ANOXIC GROUPS
(N = 163-232)

3-year measures: Cognitive (Binet IQ, [illegible], Concepts); Perceptual ([illegible])

7-year measures
Cognitive
WISC
Verbal IQ -55 Se 9 oA = 
Performance IQ -37 46 i 6 E 
Full Seale IQ 5 bu 2 22 4 
Information 47 43 s А aT a7 
Comprehension -36 28 ы 23 a4 
Arithmetic -38 Jes a 18.1 
Similarities .38 P^ їч b 17 
Vocabulary 49 ja T и 18 
Digit Span -39 0 17 “134 ar 
Pieture Completion -29 {29 35 98 125 
Pieture Arrangement 39 = “28 17 112^ 
| Dime Design e a 94 E .23 
| ject Assemb] 8 00" 
cating y En 19 .00 .29 -19 
ilmore Oral Reading Test 
Accuracy и .29 D He 4 m 
Comprehension :85 А — 03s 09" 07" 
| Reading Rate 18 P : 
ereeptual 
Perceptual-Motor -39 Re d E м 
Embedded Figures 38 a 27 21 15 
ereeptual Attention 24 : 


* Not significant at the .05 level. All other coefficients are significant (p < .05). 
TABLE 24
CORRELATIONS BETWEEN 3-YEAR EXAMINER RATINGS OF BRAIN INJURY TRAITS AND 7-YEAR
EXAMINER RATINGS IN THE COMBINED ANOXIC AND NORMAL GROUPS
(N = 225-233)

                                         3-year brain-injury traits
7-year ratings               Activity      Impulsivity   Demandingness   Distractibility   Composite
Impulsivity                  .07           .05           .04             .08               .06
Distractibility              -.01          -.07          -.02            .03               -.04
Dependency                   .05           .05           .14*            .12               .09
Rigidity                     .04           .09           .07             .08               .08
Anxiety                      .04           .17*          .05             .12               [illegible]
Compulsivity-impulsivity     .18**         .19**         .11             .13*              .16*
Activity                     [illegible]   [illegible]   .15*            .13*              [illegible]
Rigidity-lability            .06           .02           .00             .00               .02

* p < .05.
** p < .01.


correlated higher with 7-year intellectual 
measures than with the perceptual meas- 
ures. This latter result is consistent with 
the conclusion that perceptual performance 
as measured at 3 years of age was just 
beginning to emerge and was not clearly 
separable from intelligence. 

The correlations between the 3-year and 
7-year examiner ratings are given in Tables 
24 and 25. In general, all of the corre- 
lations are either very low or not signifi- 
cant. Where the correlations are signifi- 
cant, fairly definite trends emerge. The 
7-year ratings of compulsivity-impulsivity 
and activity are significantly related to the 
3-year brain-injury traits. The only 7-year 
rating which is consistently and significantly
correlated with the 3-year maladjustment
traits is Anxiety.

The relationships between the 3-year 
and 7-year parent ratings are presented in
Table 26. There would appear to be a 
greater degree of consistency in these 
ratings than was true for the examiner 
ratings. However, it may be noted that the 
parent raters were the same on both occa- 
sions. No great significance may be at- 
tached to the finding that the correlations 
with the 3-year Brain-Injury composite 
are higher than those with the Maladjust- 
ment composite. The traits contributing to 
the latter were different from those in the 
present study, and the measures were ob- 
tained in a different manner. Furthermore, 


TABLE 25
CORRELATIONS BETWEEN 3-YEAR EXAMINER RATINGS OF MALADJUSTMENT TRAITS AND 7-YEAR
EXAMINER RATINGS IN THE COMBINED ANOXIC AND NORMAL GROUPS
(N = 227-232)

                                   3-year maladjustment traits
7-year ratings               Infantilism   Negativism   Fearfulness   Composite
Impulsivity                  .00           -.01         -.12          -.05
Distractibility              .05           -.06         -.08          -.04
Dependency                   .06           .06          -.06          .02
Rigidity                     .13*          .08          .09           .13
Anxiety                      .16*          .15*         .15*          .21[illegible]
Compulsivity-impulsivity     .04           .11          -.11          .02
Activity                     -.01          -.02         -.16*         -.06
Rigidity-lability            -.06          -.01         -.08          -.07

* p < .05.
TABLE 26
CORRELATIONS BETWEEN 3-YEAR AND 7-YEAR PARENT RATINGS IN THE COMBINED ANOXIC
AND NORMAL GROUPS
(N = 229-233)

                     3-year Parent Questionnaire
7-year ratings       Brain-Injury composite    Maladjustment composite


~.16* 
.30°* 


Vineland SQ 
К. 24** 
.18** 00 


4 
Activity 
Distractibility 
Rigidity 
Impulsivity .36** 
Adjustment .20** 
Motor coordination 04 12 
Li 


* p < .05.
** p < .01.


the Vineland SQ correlates poorly with 
both composite scores. 

The correlations between the 3-year and
7-year anthropometric measures were as
follows: .54 for head circumference (p <
.01), .10 for chest circumference, .65 for
height (p < .01), and .03 for weight. The
correlation for head size is not surprising
since this measure changes the least with
age. The trend present in the 3-year sample
which became significant at 7 years with
regard to chest size suggested that the correlation
would be significant. The reasons
for this low relationship are not clear.
Weight, of all these measures, is probably
most susceptible to wide fluctuations.

While there is a clearly greater number
of significant relationships between 3-year
and 7-year measures than between newborn
and 7-year measures, the order of
these relationships is still fairly low. These
correlations would suggest that a great deal
of error would occur in any attempt to predict
7-year status from 3-year measures.
One would hope that sufficient development
has occurred by 7 years of age to make
prediction of future status a more reliable
affair.


Frequency of Abnormal Findings 


The previous investigators (Graham et
al., 1962) attempted to determine the number
of deviant individuals to be found in
each of the groups. This procedure always 


entails the selection of an arbitrary cutting 
point. In the case of the 3-year study, the 
lower 2% of the normal group distribution 
on several measures was used. The present 
study used a somewhat more generous 
cutting point. 

It should be noted that this approach involves
a somewhat different question than
the group analyses. One group may per- 
form at a consistently inferior level when 
compared with another group on a series of 
measures. However, different individuals
may be contributing to that lower mean
performance on different measures. To put
it another way, a child may do poorly on a 
measure of performance in one area of 
functioning. He may or may not do poorly 
in other areas. The question, then, becomes 
one of the consistency of functioning in 
different areas. It was hypothesized that 
the anoxics would show a greater degree of
such consistency than would the normals.

The measures used for this analysis were
selected on the basis of two criteria. First, 
reasonably significant group differences 
must have been obtained on the measure. 
Second, the measure had to be available for 
nearly all Ss. The cutting point used on the 
various measures was any performance 
equal to or below the performance of the 
poorest 16% of normal Ss. This range would 
be equivalent to that beyond one standard 
deviation below the mean of a normal dis- 
tribution. 
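
The correspondence between the poorest 16% and one standard deviation below the mean can be verified directly, and the selection rule can be sketched as well. In the code below the scores are hypothetical, the function name is illustrative, higher scores are assumed to mean better performance, and rounding down the number of cases is only one reading of the stated preference for fewer cases; ties are not handled.

```python
from statistics import NormalDist

# Area of a normal distribution lying more than one SD below the mean.
share_below = NormalDist().cdf(-1.0)
print(round(100 * share_below, 1))


def cutting_point(normal_scores, percent=16):
    """Score at or below which roughly the poorest `percent` of the normal Ss fall."""
    ordered = sorted(normal_scores)
    n_selected = max(1, len(ordered) * percent // 100)  # round down: fewer cases
    return ordered[n_selected - 1]


# Hypothetical scores for 25 normal Ss (not study data).
scores = list(range(80, 130, 2))
print(cutting_point(scores))
```

The same cutting point, derived from the normal group, would then be applied to both groups.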

It was not possible to maintain the exact 
use of this cutting point for all measures. 
Some were categorical, and others produced 
a number of ties. Where this occurred the 
selection tended to go in favor of fewer 
cases. Two such analyses were conducted. 
The first involved four test scores. Table
27 gives the results of this analysis. 

It may readily be seen that a consistently 
greater proportion of anoxics do poorly
than do the normals. The numbers of
individuals showing deviant
performance differ significantly at
all levels by chi-square analysis (p < .05).
When a criterion of poor performance on
three or more measures is employed, 5.3%
of the normals and 14.9% of the anoxics
are included. When the minimal degree of 
TABLE 27
FREQUENCY OF DEVIANT FINDINGS WITH FOUR TEST MEASURES

                                     Normal           Anoxic
Measure                             N      %         N      %
Vineland SQ                         22    16.7       39    41.5
WISC Vocabulary                     24    18.2       35    37.2
Perceptual-Motor test               20    15.2       23    24.5
Test of Perceptual Attention        21    15.9       24    25.5
Individuals deviant on one or
  more of the above                 57    43.2       68    72.3
On two or more of the above         23    17.4       38    40.4
On three or more of the above        7     5.3       14    14.9

Note.—Deviant findings were defined as a performance equal to or below the performance of the poorest 16% of the normal Ss.


deficit present in the anoxic group is con- 
sidered, this separation would appear to be 
fairly good.
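
The tabulation of individuals deviant on "k or more" measures can be sketched as below. All scores and cutting points are hypothetical, the function names are illustrative, and higher scores are assumed to indicate better performance on every measure.

```python
def count_at_least(deviant_counts, max_measures):
    """For k = 1..max_measures, count Ss deviant on k or more measures."""
    return {k: sum(1 for c in deviant_counts if c >= k)
            for k in range(1, max_measures + 1)}


def is_deviant(score, cutoff):
    """A score is deviant when it is equal to or below the cutting point."""
    return score <= cutoff


# Hypothetical scores of five Ss on four measures, with hypothetical cutting points.
cutoffs = [85, 8, 40, 12]
subjects = [
    [90, 10, 45, 15],  # deviant on none
    [84, 10, 45, 15],  # deviant on one
    [80, 7, 45, 15],   # deviant on two
    [85, 8, 40, 15],   # deviant on three (ties count as deviant)
    [70, 5, 30, 10],   # deviant on all four
]
deviant_counts = [sum(is_deviant(s, c) for s, c in zip(row, cutoffs))
                  for row in subjects]
tally = count_at_least(deviant_counts, max_measures=4)
print(deviant_counts, tally)
```

Because the categories are cumulative, the count for each k can never exceed the count for k - 1, which serves as a quick consistency check on a published table of this kind.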

However, it was felt that the inclusion of 
some personality measures would be de- 
sirable. Three ratings were added to the 
group of four tests. The results of this sec- 
ond analysis are presented in Table 28. 

The results from the second analysis are
of considerable interest. First, it may be
noted that there is a slightly higher
proportion of normals found to be deviant on
one or more measures than is true of the
anoxics. There are two opposing tendencies
operating to produce this result. It would
be expected that a greater proportion of
anoxics than normals would be selected
since significant group differences were
obtained on these measures. On the other
hand, if our hypothesis of greater
consistency of poor performance in the anoxic
group is true, then a greater proportion of
normals with at least one deviant measure
should be found. As it turned out, these
tendencies nearly canceled each other in
the present instance.

TABLE 28
FREQUENCY OF DEVIANT FINDINGS WITH SEVEN MEASURES

                                      Normal          Anoxic
Measure                               N      %        N      %
Vineland SQ                           22    16.7      39    41.5
WISC Vocabulary                       24    18.2      35    37.2
Perceptual-Motor test                 20    15.2      23    24.5
Test of Perceptual Attention          21    15.9      24    25.5
Examiner rating, impulsivity          24    18.2      30    31.9
Examiner rating, distractibility      22    16.7      29    30.9
Parent rating, maladjusted            17    12.9      21    22.3
Individuals deviant on one or more
  of the above                        81    61.4      55    58.5
On two or more of the above           38    28.8      55    58.5
On three or more of the above         20    15.2      33    35.1
On four or more of the above          11     8.3      20    21.3
On five or more of the above           0     0        13    13.6

Note. Deviant findings were defined as a performance equal to or
below the performance of the poorest 16% of normal Ss.

As the number of measures on which deviant
scores occur increases, there is a
rapid decrease in the proportion of normals
and a very slow decrease in the proportion
of anoxics. The criterion of poor
performance on five or more measures
includes no normals and 13.6% of the
anoxics. The differences between the
groups in terms of the number of individuals
deviant on two or more measures
are significant by chi-square test (p <
.01). All of the subsequent differences are
also significant (p < .01). It would appear
that, although many anoxics and normals
may show evidence of deviant response by
our criteria, the anoxics appear to be much
more consistent in their poorer functioning
than is true of the normals.
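
The chi-square comparison reported for Table 28 can be illustrated with the "two or more" row. This is a sketch under assumptions: the group sizes of 132 normals and 94 anoxics are not stated in this section and are inferred from the printed counts and percentages, and the uncorrected Pearson formula is used because the text does not say whether a continuity correction was applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Assumed group sizes, inferred from counts and percentages in Table 28.
NORMAL_N, ANOXIC_N = 132, 94
# Individuals deviant on two or more of the seven measures.
normal_deviant, anoxic_deviant = 38, 55

chi2 = chi_square_2x2(
    normal_deviant, NORMAL_N - normal_deviant,
    anoxic_deviant, ANOXIC_N - anoxic_deviant,
)
CRITICAL_01 = 6.635  # chi-square critical value for 1 df at p = .01
print(round(chi2, 2), chi2 > CRITICAL_01)
```

Under these assumptions the statistic is well above the critical value, in line with the significance level reported in the text.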


Discussion 


The 7-year project was undertaken to 
answer several questions. Naturally, we 
were concerned with the extent of impair- 
ment in the anoxic group at 7 years of age. 

It is clear that several types of minimal
'dm & are evident. Our results are 
reasonably consistent with the hypothesis
of a continuum of reproductive casualty.
The relationship of our results to those
obtained in the 3-year study is of even
greater interest. Some of the deficits remain 
essentially the same while others show 
changing trends. There are no longer any
significant differences in the number of
signs of neurological deficit. However, this
result is most likely attributable to the lack
of stability of the signs found at both ages.
Differences in intelligence, while never
great, are minimal in our children.
It would seem reasonable to suspect
that the Vocabulary differences actually
reflect differences in abstract ability and
that these may become more clearly
differentiated with increasing age.
Some of the differences between verbal
and perceptual functioning obtained at 3
years of age led the previous investigators
to speculate about the relationship between
age of onset of trauma and type of deficit
(Ernhart et al., 1963; Graham et al., 1962).
Hebb (1949) had suggested that early
damage to brain structures would impair
vocabulary ability to a greater extent than
would damage occurring in later years. In
contrast to language, impairment in
perceptual and conceptual ability is marked
with damage occurring in adulthood. It is
[illegible] that a deficit would occur in
[illegible] behavior if the damage occurs
prior to the point at which overlearning has
taken place.


Our data gave little support to this 
hypothesis. A deficit in perceptual-motor
ability appeared at 7 years in our study 
while it did not at 3 years of age. Further- 
more, it was roughly equivalent to the defi- 
cit obtained in vocabulary. It may also be 
noted that a number of other measures of
language ability were included on which no
deficit occurred. It has been suggested that 
the vocabulary impairment actually re- 
flects impairment in abstract conceptual 
ability at age 7. This deficit might still 
exist in future years, at which time a 
standard vocabulary test might no longer 
detect it. It is tempting to speculate that a
given type of deficit may be manifest in
different behaviors at different ages. 

Hebb's (1949) theorizing would also 
suggest that brain trauma occurring at 
birth would have greater effects than later
trauma since it might affect "early
learning." Belmont and Birch (1960)
demonstrated that brain-damaged children
showed a greater deficit on a visual
reconstruction task than did brain-damaged
adults. However, other authors have ar- 
gued in favor of lesser rather than greater 
deficit as a function of early trauma 
(Kennard, 1942; Teuber & Rudel, 1962). 
It would seem just as reasonable to assume 
that learning during the period of greatest 
maturation could minimize the effects of 
early injury. Unfortunately, little decisive 
evidence is available which bears on this 
issue. 

There is recent evidence which lends 
some weight to the hypothesis that a deficit 
may be manifested in different ways at 
different ages. Teuber and Rudel (1962) 
have presented evidence that some be- 
havioral manifestations of deficit diminish 
with age, some increase with age, and still 
others are fairly stable throughout the 
entire period of childhood and adolescence. 
It is clear that the age at which a behavior
trait is measured is a relevant variable. It is 
also reasonable to assume that some be- 
havioral deficits cannot be demonstrated 
until the child has developed sufficiently to 
show them. What remains, then, is the 
question as to whether a type of impair- 
ment, once it has appeared in behavior, will 
appear in different behaviors at different 
ages, but still be identified as the same
deficit. 

Another finding of importance to emerge 
from the project relates to personality 
functioning. Our results give little support 
to the hyperkinetic personality syndrome.
While the examiner ratings of Distracti- 
bility and Impulsivity were significant, 
the differences were not great enough to be 
of clinically predictive significance.

Our most important finding appears to 
be in the area of social competence. The 
Vineland Social Maturity Scale was the 
"best" measure in the project in terms of 
group discrimination. It might be argued 
that the information obtained from this 
instrument was biased. The mothers of the 
anoxic children might have treated them 
differently, sheltered or overprotected them, 
because something was "wrong" at birth. 
While this possibility cannot be ruled out, 
it seems to be unlikely. For the most part, 
these children were functioning within 
normal limits. Our impression was that 
their parents reacted to them accordingly. 
Furthermore, many of the parents did not 
seem to be aware of any connection be- 
tween our study and the newborn condi- 
tion. Even if it were possible to demon- 
strate that differential treatment were the 
relevant variable, the finding would be of 
no less significance. Of course this issue 
demonstrates one of the problems which is 
inherent in a study such as this. One must 
always be very cautious in attributing 
causality where only relationships have 
been established. 

It would be of value to consider the other 
evidence from the study which lends sup- 
port to the finding of a deficit in social 
competence. The psychiatric ratings con- 
tribute additional evidence. The finding, 
along with the Communication rating, 
suggests that anoxics have verbal capacity,
but may run into difficulties in expressing
their thoughts and ideas to others. One 
might speculate that this finding hints at a 
minimal example of the phenomenon found 
in expressive aphasia. The significant find- 
ing with the Social Sensitivity rating on 
which more anoxics were found to be
insensitive to interpersonal stimuli suggests
a somewhat broader deficit than that 
indicated by Communication. It may also
be noted that the differences obtained on
the examiner ratings of Impulsivity and
Distractibility could contribute to the
lower social competence of the anoxics.

The psychophysiological data which were 
not included in this report produced one 
finding which offers some support in the 
area of personality functioning. Many 
children did not complete the psycho- 
physiological procedure either by a direct 
refusal to cooperate throughout or an in- 
ability to relax in the laboratory. A certain 
amount of fearfulness was noted in almost 
every case. The child was required to rest 
in a hospital bed during this procedure. 

Surprisingly enough, the reactions to the 
electrodes and wires were not nearly as 
marked as those to the hospital bed itself. 
The hospital setting in this regard appeared 
to be somewhat frightening to these Ss. The 
amount of prior experience (all of the pro- 
cedures were carried out in the hospital) 
had little effect. More anoxics than normals
(p < .05) did not complete this procedure.
This result is strikingly similar to
those of Bolin (1959) who found more 
signs of fearfulness in children subjected to 
a long birth duration. In general, we would 
conclude that the deficits in personality 
functioning suggested by our findings war- 
rant careful future investigation. They are 
especially important because they differ 
from the usually hypothesized hyperkinetic 
personality syndrome. 

Again, the question of predictability of 
later deficits from newborn measures may 
be considered. We have already noted that 
such predictability is poor. Perhaps some
qualifications should be made. Even if the
clinical criteria from the birth ratings were 
perfectly reliable, there would be no guaran- 
tee of corresponding amounts of cerebral 
deficit. However, if one were even willing to 
grant this possibility, there would still be 
the question of the relationship between 
degree of cerebral deficit and future behav- 
ioral deficit. 

If the range of cerebral deficits were 
sufficiently broad, we would probably expect 
substantial, but far from perfect, correla- 
tions of cerebral deficits with behavioral 
deficits. We have no clear evidence with
regard to the degree of cerebral deficit of
our anoxic children. However, if there is
one thing that can be inferred from our 
results, it is that such deficits for the group 
as a whole must be minimal. It is certainly 
reasonable to expect that the newborn 
measures would not demonstrate high rela- 
tionships with later performance under 
these circumstances. 

It should be noted that the finding of
minimal deficits in relation to perinatal 
anoxia is fully as important as the finding 
of any deficit at all. Of course, it is possible 
that some of those deficits now present will 
diminish while others will begin to appear 
as the children grow older (cf. Teuber & 
Rudel, 1962). Nevertheless, the deficits will 
probably remain fairly minimal for our
group. The differences between anoxics and
normals on some of our discriminating 
measures were smaller than those which 
were attributable to sex- and race-status 
differences. Consequently, some caution 
must be employed in generalizing from these 
findings when it is possible for such effects 
to be relegated to secondary positions by
other environmental forces. 

It would be of interest to compare our 
findings with those from other similar
studies. However, follow-up studies which 
have dealt with measures other than intel- 
ligence and neurological deficit are few in 
number. Furthermore, our findings indicate
that the age of the children must also
be considered before any direct comparisons
may readily be made. There are
three recent follow-up studies which would
appear to be fairly comparable in a number
of respects to the present one.

Fraser and Wilks (1959), at the University
of Aberdeen, studied 100 children
with asphyxia compared with normal
controls. Their criteria of asphyxia were
clinical criteria with delay in breathing
being the principal one. Most of their Ss
were followed at 7.5 years of age. They
assessed intelligence, personality, reading
ability, perceptual functioning, and
neurological deficits.

Schachter and Apgar (1959), at Columbia
University, studied a group of 60
children with a history of perinatal
complications and 96 normals when the
children were about 8 years old. Their children
had been selected on criteria very similar 
to those used in the St. Louis studies. They 
made extensive assessments of intelligence, 
perceptual functioning, abstract ability, 
and personality. 

Arenberg (1960), at Duke University, 
studied 66 children whose breathing was 
delayed at least 1-5 minutes and/or re- 
quired resuscitation. These Ss were 5-12 
years of age when they were followed up 
and were matched with a group of normals. 
Intelligence, motor development, and
visual-motor functioning were assessed.

The results of these studies are compared
with those of the present study in Table
29. It may be noted that Arenberg (1960)
found no significant differences between
his groups. Of these four studies, only the
Columbia University study obtained a
significant difference in IQ. The previous
investigators (Graham et al., 1962) noted 
that where a significant difference in intel- 
ligence has been found, 

a deficit of five IQ points appears to be a rough
approximation from the several studies [pp. 46-47].

A deficit in perceptual functioning was
evident in the majority of these studies.
The University of Aberdeen study is the
only one to report positive neurological
findings. Our study is the only one to find
differences in the area of personality. To
some degree this result may be a function
of difference in emphasis in the measures
used, but the differences cannot be explained
entirely on that basis.

TABLE 29
COMPARISON OF SEVERAL STUDIES ON PERINATAL ANOXIA

                                    Study (a)
Measure             Aberdeen (b)  Columbia (c)  Duke (d)  St. Louis (e)
Age                 7.5-11.5      8.4           5.0-12.0  7.0
IQ                  0             +4.8          0         0
Reading             0             -             -         ?
Vocabulary-
  abstract ability  -             [illegible]   -         +
Perceptual          +             +             0         +
Motor               -             -             0         ?
Neurological        +5%           -             -         0
Personality         0             0             -         +

(a) Presence of significant deficit is indicated by (+), questionable
presence by (?), absence by (0), and omission of a measure by (-).
(b) Fraser and Wilks (1959).
(c) Schachter and Apgar (1959).
(d) Arenberg (1960).
(e) Corah, Anthony, Painter, Stern, and Thurston (present study).

One lack in our study as well as a number 
of other studies was the measurement of 
motor abilities. It is hoped that future in- 
vestigations would consider this area in 
more detail. Our ratings indicate some 
deficit in this area. However, it is no more 
than suggestive. The present findings also 
suggest that a more fruitful approach to 
personality investigation in this area might 
make use of specific age-standardized
behaviors. It is conceivable that there is little
possibility of going beyond a demonstration
of deficits in social competence in
general. After all, the cerebral deficits
giving rise to such behavioral deficits may 
be somewhat "nonspecific" themselves 
(cf. Windle, 1960, 1963). However, we 
would hope that future studies may be 
able to delineate these deficits with greater 
specificity. 

Finally, we hope that the hypothesis 
with regard to a given deficit being mani- 
fested in different behaviors at different 
ages is susceptible to a clear test in the
future. Our results are merely suggestive. 
However, this hypothesis could be relevant 
in adding to our ability to explain why 
investigators obtain different results from 
groups of different ages. 


SUMMARY 


The effects of perinatal anoxia were 
studied in a group of 7-year-old children 
who had also been examined at birth and 
3 years of age. The follow-up sample was
composed of 134 Ss who were normal full- 
term newborns and 101 Ss who were anoxic 
full-term newborns. The 235 Ss repre- 
sented 85.5% of the selected sample which 
had been studied at 3 years of age. The
anoxic group was composed of three sub- 
groups, those with signs of prenatal anoxia, 
those with postnatal apnea, and those with 
signs of both conditions. 

The follow-up was conducted without 
knowledge of the newborn classification 
and included assessment in the areas of 


cognitive and perceptual functioning,
personality, and neurological impairment.
Although analyses by anoxic subgroups
and in relation to newborn criteria of
severity of anoxia were made, primary
concern was with overall group differences.

Differences between normals and anoxics
in intelligence which were significant in
the 3-year study were attenuated. The
anoxics no longer showed a significant
deficit in intelligence. The only test of
cognitive function on which anoxics
demonstrated a significant deficit was the
Vocabulary subtest from the WISC. This
result was interpreted in terms of a deficit in
abstract ability. There was a marked
tendency which became significant in the
subgroup analyses for the anoxics to show
impairment on a test of reading ability.
However, many Ss from both groups were
unable to read at the time of examination.

In contrast to the findings in the previous
follow-up, the anoxics obtained significantly
lower scores on a test of perceptual-motor
functioning. They also tended to do
poorly on a special test of perceptual
attention. No significant differences were
obtained on an embedded-figures test.

Results from examiner, parent, teacher,
and psychiatric ratings produced some
significant findings. The data suggest a
significant impairment in the area of social
competence for the anoxic group. There is
little support in the data for the usually
hypothesized hyperkinetic personality
syndrome.

While there was a significantly greater 
frequency of positive and suggestive neuro- 
logical signs in the anoxic group at 3 years 
of age, the difference was not significant 
in the present study although the same 
trend was present. Further analysis of
these data revealed a lack of stability in
the neurological signs obtained in the two
follow-up studies. Consequently, no clear
interpretation of changes from the earlier
to the later study could be offered.

The relationships of the measures with
ratings of newborn complication and newborn
behavior test scores were very low or
insignificant. It was concluded that later
deficits could not be reliably predicted
from newborn measures.


Analysis of the number of individuals in
each group who did poorly on several of
the measures in the study revealed that an
[illegible] number in both groups may show
[illegible] in some one area. However, anoxics
who performed poorly in one area of
functioning showed a greater consistency of
such performance over several areas than
did the corresponding normals.

It may be concluded that perinatal
anoxia is related to deficits evident at 7
years of age. It is also evident that such
deficits which do occur are reasonably
minimal for the group as a whole.
While there was some tendency for the
degree of impairment to be associated with
newborn criteria of severity, the association
is [illegible] one. A prognosis for any given
newborn would be very difficult to make
except when there are very severe or
numerous complications.
This monograph reports 5 experiments relating partial reward effects and
discrimination learning to a theory based upon a frustrative conception of
nonreward. In Experiment 1 data are provided to demonstrate how frustrative
nonreward is involved in discrimination learning. More particularly,
evidence is provided to support a breakdown of discrimination learning into
4 stages with respect to the operation of primary (R_F) and secondary (r_F)
frustrative effects. Experiment 2 shows that rate of discrimination learning
depends on the number of prior continuous rewards in relation to both
(separately) of the discriminanda. Experiments 3 and 4 examine the effect on
discrimination of different amounts and kinds of prediscrimination
experience with one of the discriminanda. Finally, Experiment 5 studies the
effect on discrimination performance of a prediscrimination condition in
which one stimulus (S_1±) is partially rewarded and the other (S_2+)
continuously rewarded. This prediscrimination condition is interesting in
itself as a within-S partial reinforcement experiment, and the results of the
prediscrimination phase show agreement with the usual (between-S) PR
experiment in which a separate group is run under each condition.


An interpretation of partial reinforcement
acquisition and extinction effects
based on frustrative-nonreward premises
(Amsel, 1958; Spence, 1960) and the extension
of this theory to discrimination learning and
prediscrimination phenomena (Amsel, 1962)
will be shown to provide the framework and
the impetus for a series of experiments to be
reported in this monograph. These experiments
and their theoretical underpinning
constitute a step toward our ultimate goal,
to bring together, under a unifying explanatory
system, some seemingly diverse behavioral
phenomena: (a) partial reinforcement
extinction effects (PRE); (b) partial
reinforcement acquisition effects, most notably
higher acquisition asymptotes for some
response measures but not others under
partial reward conditions than under
continuous reward conditions after extended
training; (c) stages, representing changing
processes, in discrimination learning; and
(d) the effects of prediscrimination training
(prior experience with the discriminanda)
on later discrimination learning.

The research reported here and the preparation
of the report were supported by Grants
013895 and GB143 from the National Science
Foundation. The data of Experiments 1 and 2
were collected at Tulane University (Ward, 1962).
The senior author is responsible for Experiments
3, 4, and 5, performed at the University of
Toronto, and for the form of the present report; and
he is pleased to acknowledge the assistance of T.
Tusec and E. Thomas in Experiment 3 and of
Rochel Gelman, J. R. MacKinnon, M. E. Rashotte,
and C. T. Surridge in Experiment 5. Experiment
4 [illegible] data originally reported by Falehuck

The plan of this report is to present, first, 
an outline of the form and substance of the 
theory and a discussion of its implications, 
particularly for partial-reward and dis- 
criminative phenomena. Following this in- 
troduction, five experiments will be reported, 
in each case preceded by an account of how 
the theory dictated the particular experi- 
ment. 

Conditioning-Model Theory

The conceptual form of our analysis
derives from a portion of neobehavioristic
theory of the Hull-Spence variety which
combines elements of Pavlovian and
Thorndikian conditioning. This type of theorizing,
termed conditioning-model theory
(Lachman, 1960), emphasizes the role of
classically conditioned implicit responses in
instrumental learning. Here the role of goal
events (S_G and R_G) is all important, the
major functions of the goal stimulus (S_G)
being to serve as the unconditioned stimulus
for a goal response (R_G) and to provide
for the classical conditioning of cues
antedating the goal (S,) to portions of R_G in
instrumental learning.

FIG. 1. Schema of Pavlovian and Thorndikian
learning showing how conditioning processes are
involved in instrumental learning (after Spence,
1956, p. 60). (In this figure and in Figure 2, dashed
lines represent classically conditioned, learned
connections; double lines represent strengthened
instrumental connections; solid lines represent
unlearned connections; and wiggly lines represent
a contingency relationship between the instrumental
response and the appearance of the goal stimulus
in instrumental learning.)

A schematic representation of these two
kinds of simple learning and their
interrelationships is presented in Figure 1. The
top portion of the figure shows, separately,
classical (Pavlovian) and instrumental
(Thorndikian) conditioning. Interchangeable
symbols have been provided for the
stimuli and responses of classical conditioning:
the conditioned response is also designated
r_G; the unconditioned stimulus, S_G;
the unconditioned response, R_G; and so on.
Clearly, the schema or paradigm for Pavlovian
conditioning applies also to the conditioning
of r_G (the anticipatory goal response).
The bottom part of this schema shows how
classical conditioning is involved in
instrumental learning, and provides the
cen[illegible] of the classical conditioning
model of instrumental behavior with which we
will be concerned. Represented are two
phases: in Phase I the classical conditioned
response (r_G) is formed as part of the
instrumental sequence, and in Phase II the
classical conditioned response once formed
moves forward in time and, through its
feedback stimulation (s_G), becomes part of the
mechanism for the evocation of the
instrumental response.

The relationships shown in Figure 1 are
based upon a general conception of goal
events (R_G). In an earlier article (Amsel,
1958), it was proposed that goal events were
of three kinds: rewards (R_R), punishments
(R_P), and frustrations (R_F). To these three,
following Mowrer (1960), a fourth kind,
relief, might be added. Such a classification
of goal events would set relief and
punishment into the same relationship as
reward and frustration. We will deal only
with the latter two members of such a fourfold
classification; that is to say, with
reward-frustration factors in learning. However,
as Martin (1963) has shown, the kind
of reasoning contained here and in earlier
treatments of reward and frustration can
also be applied to an analysis of punishment
and relief.

Figure 2 represents our assumptions as to
the manner in which reward and frustrative
nonreward are involved in simple
instrumental learning. It shows, schematically,
the (classical) conditioning of anticipatory
reward (r_R) and anticipatory frustration
(r_F) in instrumental behavior. Rewarded
trials occasion the development of r_R
(anticipatory reward) which moves forward in
the temporal sequence to become part of the
[illegible] affecting the instrumental
response. When r_R and its feedback stimulation,
s_R, affect the instrumental response,
behavior can be said to involve incentive
motivation. The right-hand side of
Figure 2 indicates that when reward-incentive
motivation is operating and the goal
is a nonrewarding (S_non-R) rather than a
rewarding event, primary frustration (R_F)
[illegible]. Frustration is then defined as
nonreward in the presence of r_R. Anticipatory
reward and nonreward are both necessary,
but not sufficient, conditions for frustration.
Since R_F operates as an unconditioned
response to S_non-R, nonrewards will occasion
the conditioning of r_F (right-hand side of
Figure 2) to the cues of S, . This conditioned
form of anticipatory frustration moves
forward in time or backward along the
instrumental sequence to affect the instrumental
response, presumably in a manner
antagonistic to that in which r_R affects the
instrumental response.

FIG. 2. Schematic representation of the
conditioning of anticipatory reward (r_R) and the
involvement of r_R and nonreward (S_non-R) in the
conditioning of anticipatory frustration (r_F).


In a theory such as has been outlined,
partial-reward learning and discrimination
learning can be shown to involve similar
processes. To quote from an earlier
discussion (Amsel,

Partial reinforcement and discrimination learning
procedures are highly similar; in fact, they are
almost identical if we compare partial-reinforcement
training to the early stages of discrimination
training with separate (successive) rather than
joint (simultaneous) presentation of stimuli. In
both, at the outset, [the subject] S is rewarded
on some occasions and not on others for the same
instrumental response. The difference is that, in
partial-reinforcement experiments, [the experimenter]
E is training S to make the same response
on every trial, whereas in discrimination learning
different stimulation is involved when S is rewarded
and not rewarded and S comes ultimately
to respond (or not respond) selectively; but only,
as a rule, after S has learned to respond
non-selectively on the basis of partial reinforcement

The theory with which we have been
working divides partial-reward acquisition
or discrimination learning into four stages.
Each of these four stages is assumed to
involve processes quite different from the
others, especially in regard to the operation
of the conditioned anticipatory response
factors depicted in Figure 2. For expository
purposes it will be useful to outline some of
the statements of the theory in relation to
the double-runway apparatus used in a
variety of studies in our laboratory and
others for a number of years (Amsel, 1962).
This is, essentially, two runways in series:
a start box, a first runway, a first goal box,
a second runway, and a second goal box,
the goal box of the first runway being also
the start box for the second. (Experiment 1
provides a detailed description of this type
of apparatus.) In a typical experiment
(Amsel & Roussel, 1952) an S is at first
trained to run from the start box to the first
goal box for food, then out of the first goal
box to the second goal box for a further
food reward. Then, in a series of test trials,
S finds no food (is frustrated) in the first
goal box on some trials (S is always rewarded
in the terminal goal box), and the effects of
this frustration are observed as changes in
vigor of responses in the runways leading to
and away from the frustrating event. By
means of such an apparatus it is possible to
separate and measure two responses, each
reflecting a property of frustration: (a) the
frustrated response and changes in speed
indicating anticipatory (secondary) frustration
(r_F), measured in Runway 1; and (b)
the frustration-motivated response in Runway 2
and increases in its strength indicative
of the magnitude of primary frustration
(R_F) in Goal 1 (G_1). Such increases in vigor
of responding in Runway 2 following
nonreward in G_1 are instances of the frustration
effect (FE), and they serve as indicants of
the strength of the primary, immediate
frustrative reaction to nonreward. However,
the theory is mainly concerned with
the conditioned form of such frustrative
reactions and the manner in which they
come to affect behavior in partial-reward
learning and in discrimination learning
occurring in Runway 1 of the double runway
or in any (simpler) single runway or
other instrumental response situation.
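
One way to make the frustration effect concrete is as a difference in Runway 2 vigor between trials that follow nonreward and trials that follow reward in the first goal box. The sketch below is illustrative only: the speeds are invented, the units are arbitrary, and the function name `frustration_effect` and the use of a simple difference of means are assumptions rather than the scoring actually used in these experiments.

```python
from statistics import mean


def frustration_effect(speeds_after_nonreward, speeds_after_reward):
    """Mean Runway 2 speed after nonreward minus mean speed after reward."""
    return mean(speeds_after_nonreward) - mean(speeds_after_reward)


# Hypothetical Runway 2 running speeds (arbitrary units, invented data).
after_nonreward = [1.9, 2.1, 2.0, 2.2]
after_reward = [1.5, 1.6, 1.4, 1.7]

fe = frustration_effect(after_nonreward, after_reward)
print(f"FE = {fe:.2f}")
```

A positive value corresponds to more vigorous responding in Runway 2 following nonreward, which is what the text treats as an indicant of primary frustration.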
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The sequence of four hypotheses under 
consideration is schematized in Figure 3. 
It differentiates four stages of practice in a 
situation in which, for example, Runway 1 
performance is partially rewarded in G1 and
FE is measured in Runway 2; or a situation
in which S learns a discrimination on the 
basis of Runway 1 cues, while FE develops 
and is measured in Runway 2. Each of these 
cases will be detailed separately. 


Theory Applied to Partial Reward and Some 
Implications 

In Stage 1 of partial-reward training
(see Figure 3) reward trials operate to effect
the conditioning of r_R; nonreward trials
cannot effect any significant amount of
frustration until r_R develops in strength. In
Stage 2, when r_R is already strong and is a
factor in the evocation of the instrumental
response, the occurrence of nonreward results
in primary frustration (R_F). During
this second stage there is also the beginning
of a buildup of r_F, conditioned on the basis
of primary frustration (R_F) as the unconditioned
response. In Stage 3 r_F counteracts r_R
(anticipatory frustration producing
avoidance and anticipatory reward producing
approach) and the result is conflict.
Finally, in Stage 4 of partial-reward acquisition,
which is reached if partial-reward
acquisition is carried on long enough,
anticipatory frustration-produced cues
come to evoke avoidance as well as approach.
This conditioning of cues (s_F) signaling
nonreward to continued approach is
the very important mechanism that we deal
with in these experiments. It is a mechanism
of persistence; the kind of persistence that
shows up as greater resistance to extinction
following partial- than following continuous-reward
acquisition (e.g., Longstreth, 1964).

Fig. 3. Diagramed sequence of hypotheses relating frustrative nonreward to stages of partial-reward
and discrimination learning.

Obviously, the crucial feature of this kind
of theory is the role of the anticipatory,
frustration-produced stimuli (s_F), which depend
in turn on the conditionability of
anticipatory frustration responses. There is
now some experimental support for the conceptualization
of r_F. Wagner (1963a) has
shown that cues associated with frustrative
nonreward can serve to energize the startle
response of rats in a stabilimeter and that
their reduction can reinforce hurdle-crossing
behavior. Such a finding implies that r_F has
the properties of a drive. Amsel and Surridge
(1964) report that the introduction
into an alley of cues previously associated
with frustrative nonreward in the goalbox
results in a sudden attenuation of approach
responses. They point out the similarity of
this finding to the results of experiments on
the “conditioned emotional response,” where
the aversiveness of conditioned fear is measured
in terms of its capacity to reduce the
strength of an instrumental response. In our
terminology the suggestion from these experiments
is that r_F, the conditioned form of
R_F, has properties similar to r_P, the conditioned
form of pain.

The predictions that can be made depend
entirely on what response the feedback
stimuli (s_F) from r_F evoke. At the level of
Stage 3 they are assumed to evoke avoidance
and should hasten extinction. At
Stage 4 they evoke the criterion approach
response, or responses compatible with the
criterion response, and should retard extinction.
Anticipatory frustration-produced
stimuli can therefore provide the mechanism
for persistence or for rapid abandonment of
responses, depending on prior experience.
In more general terms this analysis says
that extinction in instrumental learning is
partly, at least, a frustration phenomenon
and that the tendency to continue responding
in the face of negative indications is
acquired under partial-reinforcement conditions
when cues from anticipatory frustration
become connected to continued
approach. In terms of such a conceptualization,
PRE does not have to mean that partially
reinforced Ss fail to discriminate
acquisition from extinction; it may reflect
learning to approach rather than to avoid
in the presence of cues signaling nonreward.

There are, in addition to these rather
general considerations, some specific experimental
implications of the analysis. For
example, (a) at Stage 3 of partial-reward
acquisition (the “conflict” stage), behavior
should be more variable than it was at Stage
2 or will be in Stage 4; and (b) if PR acquisition
does not reach Stage 4, there should be
no PRE effect. That is to say, if training
is discontinued when PR acquisition behavior
is still variable, then PR acquisition
should be followed by decreased rather than
by increased resistance to extinction. Both


the use of drugs. If just 
drug could be found, one that 


implicate decreased intensity of anticipa- 


Still another implication of the frustration 
theory as it applies to PRE has been tested 
in an experiment recently reported by Ross 
(1964). The reasoning is somewhat as fol- 
lows: assuming that extinction involves 
r_F-s_F, rate of extinction (or degree of persistence)
should depend on the nature of
responses already conditioned to s_F. If an
organism carries into extinction response
associations with s_F that are compatible
with the extinguishing response, this should
retard the extinction of that response. Conversely,
if responses incompatible with the
extinguishing response are mediated by s_F,
extinction should be accelerated. This reasoning
suggests that rate of extinction following
continuous-reward acquisition will
be determined, at least in part, by the history
of S and particularly by the kinds of
(approach) responses that have been associated
with s_F in that S. In the Ross experiment
a history of partial- (as compared with
continuous-) reward acquisition was estab- 
lished to three different responses (running, 
jumping across a gap, and climbing) in a 
boxlike apparatus (Apparatus A) in the first 
phase of a three-phase experiment. In the 
second phase all Ss learned a running re- 
sponse to continuous reward in a long run- 
way (Apparatus B). In the third phase the 
running response was extinguished for all 
Ss in Apparatus B. In line with the earlier 
reasoning, if the histories of partial reward 
for running, jumping, and climbing built 
persistence into these activities in Phase 1 
of the experiment, it should be possible to 
show that Ss trained under partial reward 
in Phase 1 to the running response, or the 
response more compatible with running (in 
this case, jumping a horizontal gap), would 
be more resistant to extinction in Phase 3 
than those Ss trained in Phase 1 to persist 
in a response less compatible with running 
(in this case, climbing): persistence in climbing
to the s_F cue (the mechanism for PRE),
learned in Phase 1, does not transfer to run- 
ning in Phase 3 extinction. The results of 
the Ross experiment indicated a clear differ- 
ence in extinction of running as between 
animals with “running-partial” or “jump- 
ing-partial” histories, on the one hand, and 
those with “climbing-partial” histories, on 
the other. It is important to add that such 
differences were not found among the groups 
continuously rewarded for running, climb- 
ing, and jumping in Phase 1. This kind of 
research indicates how an S-R conditioning
model can provide a basis for understanding
mediated transfer of persistence and
fixation as an emotional motivational phe- 
nomenon. 


Theory Applied to Discrimination Learning 


In discrimination learning, Stages 1-3 are 
conceptualized as being virtually identical 
to those of partial-reward training. The 
schema of Figure 3 suggests that it is only at
the fourth stage that discrimination learning
involves mechanisms that are different from
those in partial-reward acquisition. We are
considering here discrimination learning of
a certain kind, to be sure: discrimination 
learning with single and successive presen- 
tation of stimuli rather than discrimination 
learning with simultaneous presentation of 
stimuli. This is not to say that with simul- 
taneous presentation of two or more dis- 
criminanda these mechanisms would not 
operate, but they would be much more diffi- 
cult to untangle. At the beginning of a dis- 
crimination procedure involving separate 
and successive presentation of S+ and S-,
the two discriminanda are not differentiated. 
Consequently, as far as response evocation 
is concerned, they are the same stimulus. 
They operate in the same manner as S, on
the partial-reward side until Stage 4, when 
one begins to evoke anticipatory frustra- 
tion; the other, anticipatory reward. 

In short, the difference between the two
situations depicted in Figure 3 is that,
unlike discrimination learning, the partial-reward
situation affords no basis for differential
responding to stimuli. In PR acquisition
the same response is sometimes
rewarded and sometimes not in the presence
of the same general pattern of stimulation.
In discrimination learning a response originally
nondifferential in the presence of two
physically different stimuli ultimately becomes
differential in relation to the two stimuli,
on the basis of differential reward and
nonreward to S+ and S-. The suggestion
has been that in discrimination learning of
this sort one of the factors involved is differential
evocation by S- and S+ of r_F and r_R
and that since these processes also affect PR
acquisition, implications of some importance
follow, although few of these have yet been
reported. For example, the analysis implies
that the frustrative reaction to nonreward,
as indicated by FE in Runway 2 of a double
runway, should diminish or even disappear
once a discrimination is formed on the basis
of differential stimuli in Runway 1. The
reasoning here is that discrimination implies
the evocation of r_R by S+ and of r_F by S-
as a terminal state of affairs. However, by
definition, frustration must be preceded by
r_R (anticipatory reward is a necessary antecedent
to R_F and FE); therefore, when the
discrimination has been learned and S-
evokes r_F rather than r_R, the frustrative
reaction to nonreward should be attenuated.
The experimental portion of this report will 
deal with this and other implications of the 
discrimination analysis, but first we must 
proceed one step further in the reasoning 
and develop the notion that under some 
circumstances of discrimination learning, 
failure over some prolonged period of train- 
ing to respond differentially to positive and 
negative discriminanda may not mean that 
these stimuli are in fact equivalent per- 
ceptually, but that S has learned to approach 
in the presence of the negative discrimi- 
nandum as well as in the presence of the 
positive one. 


Prediscrimination Exposure to Discriminanda 

Before proceeding, let us go back to Figure 
3 and summarize. PR acquisition and dis- 
crimination learning involve similar proc- 
esses in the first three of the four stages. In 
Stage 1 r_R-s_R develops in Runway 1 as a
function of early rewards in G1, while Runway 2
performance would indicate that
nonreward is not yet frustrating. In Stage
2 the evocation of r_R-s_R in the instrumental
sequence antedating nonreward makes nonreward
frustrating (R_F), and the same stimuli
that evoke r_R in Runway 1 now come to
evoke r_F as well. Stage 3 is a stage of conflict
between response tendencies associated
with s_R and s_F. In PR acquisition it is a
stage just preceding the conditioning of s_F
to approach (the mechanism for PRE); in
discrimination learning, Stage 3 just precedes
evidence of differential responding to
S+ and S-. By Stage 4 the mechanisms
for persistence and differentiation have been
strengthened in PR acquisition and discrimination
learning, respectively. In PR,
r_R and r_F cannot be elicited by differential
cues; consequently, s_F comes to elicit approach,
providing the mechanism for persistence
and PRE. Discrimination results
when a basis exists for differentiation, that
is, S+ → r_R-s_R → R_APP, while S- → r_F-s_F → R_AV.
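
The four-stage summary above can be restated compactly as a lookup, which makes plain that the two paradigms are treated as identical through Stage 3 and diverge only at Stage 4. The following Python sketch is illustrative only; the names `COMMON_STAGES`, `STAGE_4`, and `describe_stage` are hypothetical and the stage descriptions are paraphrases, not part of the original schema.

```python
# Stages 1-3 are assumed identical for partial reward (PR) and discrimination.
COMMON_STAGES = {
    1: "r_R-s_R develops; nonreward is not yet frustrating",
    2: "nonreward evokes R_F; r_F begins to be conditioned",
    3: "conflict between tendencies associated with s_R and s_F",
}

# Only Stage 4 depends on whether differential cues are available.
STAGE_4 = {
    "PR": "s_F elicits approach (persistence, PRE)",
    "discrimination": "S+ elicits approach; S- elicits avoidance",
}


def describe_stage(paradigm, stage):
    """Return a paraphrased description of one stage for one paradigm."""
    if paradigm not in STAGE_4:
        raise ValueError("paradigm must be 'PR' or 'discrimination'")
    if stage in COMMON_STAGES:
        return COMMON_STAGES[stage]
    if stage == 4:
        return STAGE_4[paradigm]
    raise ValueError("stage must be 1, 2, 3, or 4")
```
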

Not long ago, an extension of the frustra- 
tion theory was proposed to handle a variety 
of circumstances in which learning of a dis- 
crimination is preceded by partially or 
continuously rewarded exposure to one or 
more of the eventual discriminanda (Amsel, 
1962). These preexposures to the discrimi- 
nanda were termed prediscrimination ex- 
periences, and the discussion indicated that 
they could take a number of different forms. 
In a discrimination involving two cues, it is 
possible, for example, to expose S, prior to 
discrimination, to one of the eventual dis- 
criminanda or to both of them. If a single 
discriminandum is presented, it may be the 
one that will be S+ or S— in the later dis- 
crimination, and it may be presented in 
relation to either partial or continuous re- 
ward for an approach response. The nature 
of the prediscrimination experience will pre- 
sumably affect the rate of a subsequent 
discrimination. If, for example, prediscrimination
experience is with the stimulus to be
negative (S2-) in the discrimination
(S1+S2-), and the prediscrimination reward
to this stimulus is partial (S2±), a condition
akin to the PRE paradigm exists if S2±
training has proceeded to Stage 4, but not if
it has gone only to Stage 3. On the other
hand, our position leads to the prediction
that continuously rewarded preexposure to
S2 of an S1+S2- discrimination should result
in more rapid discrimination the greater
the number of such S2+ trials preceding
discrimination training.

It is clear that our conceptualization of 
processes involved in discrimination learning 
and in partial-reward learning may be of 
some usefulness in understanding the transfer
effects that result when a variety of
prediscrimination experiences precede the
learning of discriminations. The experimental
portion of this monograph includes
a series of experiments designed to test implications
of the prediscrimination analysis.
The specific reasoning involved in deducing
each experimental situation from the broader
outline of the theory will be left to the introduction
of each separate experiment. What
follows is a more general statement of the 
overall purpose of the study along with a 
brief statement about each of the five ex- 
periments that follow. 


Purpose of the Experiments 


When some kind of prediscrimination 
exposure to one or both of the discriminanda 
precedes their presentation as S+ and S— 
in a successive discrimination, this preexposure
may either hasten or retard the development
of differential responding to the
two stimuli. Such retardation, relative to
some appropriate control condition, results
from the transfer to the subsequent discrimination
learning of mechanisms acquired
in the earlier prediscrimination treatment.
It is plausible that these mechanisms
are the same as those transferred from acquisition
to extinction in PRE, and that in
both situations they embrace much of what
is ordinarily meant by persistence. In this
sense, then, persistence refers to (a) a
learned tendency to continue to respond in
the face of negative (nongoal) indications
and (b) a learned retardation of discrimination
evidenced by relative failure to respond
differentially to two (or more) stimuli, one
signaling reward and the other(s) nonreward.

Following some period of preexposure to 
one or both of a pair of discriminanda, discrimination
learning may be retarded. Theoretically,
a situation then exists in which
the first three stages of discrimination (see
Figure 3) are seemingly extended and the
appearance of Stage 4 is delayed. We have
argued that the main factor in this retardation
is that S-, which must evoke avoidance
through the mediating mechanism of r_F-s_F,
at first elicits approach as a consequence
of its prediscrimination exposure in connection
with partial-reward training.

A variety of experimental approaches may
be employed for studying prediscrimination
effects, and some of these are reported in the
following pages. Experiment 1 provides a
demonstration of the involvement of frustration
in discrimination learning. This experiment
lends plausibility and support to the
four-stage hypothesis, outlined earlier, as it
applies to discrimination learning with
separate (successive) presentation of stimuli.
The basic set of relationships that hold
between discriminative performance and
level of frustration, demonstrated in Experiment 1,
provides encouragement to test
the hypotheses concerning prediscrimination
effects (Amsel, 1962). Some of these
tests are carried out in Experiments 2, 3, 4,
and 5.

In Experiment 2, Ss are exposed to prediscrimination
reward in relation to both
discriminanda (S1+S2+), the variable being
the number of such exposures before learning
an S1+S2- discrimination. This provides
a test of the hypothesis that, with
initial positive tendencies to the two discriminanda
equal, rate of discrimination
learning is a positive function of strength
of r_R to the negative stimulus (S2-) in the
discrimination.

Experiment 3 provides evidence that (a)
discriminative extinction (and discrimination)
is faster after 32 S1± exposures than
after 120 such exposures, confirming the
operation of PRE in failure to discriminate
(discriminative persistence); (b) after 32
partial rewards to S1, discriminative extinction
is faster when S1 is the negative
discriminandum than when a new stimulus
(S2) is negative, while the reverse is true after
120 S1± trials; and (c) retardation of discrimination
is most drastic when the prediscrimination
treatment is S1±S2±, partial-reward
preexposure to both of the
eventual discriminanda.

The results of Experiment 4 relate most
meaningfully to the third stage of our partial-reward
hypothesis (see Figure 3). The
preliminary portion of this experiment is a
PR condition run one trial a day with a
large reward. It demonstrates clearly the
conflict stage hypothesized to occur in PR
acquisition.

Finally, Experiment 5 contains a prediscrimination
condition (S1±S2+) in which
there is either a large or small number of
PR trials in relation to one stimulus and CR
trials in relation to another. Apart from the
implications of such conditions for subsequent
discrimination learning, the prediscrimination
condition itself proves very
interesting in that it reveals PR acquisition
effects within Ss: higher asymptotic performance
to S1± than to S2+ in starting
and running (alley) times, but lower asymptotic
performance to S1± than to S2+ in
goal times. This is a result previously only
demonstrated as a between-groups effect.
This within-Ss, partial-reward experiment
also suggests new research directions, notably,
the possibility of studying in the same
organism the relationships among the behavioral
dimensions, vigor, persistence, and
choice. The concluding discussion describes
the characteristics and the implications of
such a program of research.


EXPERIMENT 1: FRUSTRATIVE FACTORS IN
DISCRIMINATION LEARNING


It has been suggested that any successful 
analysis of discrimination learning must 
include a consideration of the active effects 
of nonrewarded trials as well as the effect 
of rewards. There is even some evidence to 
suggest that nonrewarded training may contribute
relatively more toward learning a
discrimination problem than rewarded train- 
ing (Amsel, 1962). 

The Hullian analysis of discrimination 
learning (Hull, 1950, 1952), which has been 
and continues to be an influential factor 
in S-R thinking, underemphasized the role 
of nonreward. While Hull did maintain 
that the primary process involved was 
differential reinforcement, he held that its 
major function was to (a) neutralize the
effect of certain stimuli that occur in the
presence of both discriminanda, (b) increase
the power of the positive stimulus
to evoke the (approach) response, and (c)
decrease the power of the negative stimulus
to evoke the response. This decrease in
power of the negative stimulus was not
regarded as having to do with learning of
active avoidance due to nonreward; rather,
Hull postulated an inhibitory state, reactive
inhibition, which was assumed to develop
with each reaction. This reactive inhibition
was, in Hull’s conceptualization, offset by
the growth of excitatory strength to the
positive stimulus on rewarded trials. Nonrewarded
trials permitted an accumulation
of reactive inhibition which was not offset
by the growth of excitatory strength and
which led to extinction. Hull assumed that
the dissipation of I_R satisfied the criterion
of drive reduction and led to the develop- 
ment of a negative habit, conditioned in- 
hibition, which accounted for more per- 
manent extinctive effects. Amsel (1958) 
questioned the development of differential 
inhibition based on responding in discrimi- 
nation learning and pointed out, 


There is nothing in the Hullian system which 
suggests that I_R is in any way related to reinforcement
or nonreinforcement; consequently, if
sI_R is to be employed in the explanation of discrimination
learning, one would have to hold that
it develops to both S+ and S— [p. 110]. 


The further point was made that the im- 
portant differential factors in discrimination 
learning are the positive and negative goal 
events which determine positive and nega- 
tive excitatory tendencies; that work, in 
and of itself, affects both tendencies equally 
and can not be regarded as a differential 
factor. 

Amsel's position on discrimination learn- 
ing was in part a return to an earlier theoreti- 
cal treatment (Spence, 1936), which held 
the two major principles of discrimination 
to be reinforcement and inhibition. Ac- 
cording to Spence, the principle of rein- 
forcement 


assumes that if a reaction is followed by [a] re- 
ward ... the excitatory tendencies of the imme- 
diate stimulus components are reinforced or 
strengthened by a certain increment, “7” [p. 430]; 


and the principle of inhibition or frustration 
states 


that when a reaction is not rewarded . . . the ex- 
citatory tendencies of the active stimulus com- 
ponents are weakened by a certain decrement, 
“D.” It assumes that this weakening is due to an
active, negative process, inhibition, which, add- 
ing itself in algebraic fashion to the positive ex- 
citatory tendencies, results in lowered strength 
values [p. 430]. 


Spence also made the point that 


the weakening effect on an S-R connection of
failure of reward ... varies directly with the
strength of the response, being greater for strong
ones than for weak ones [p. 431].
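
The principles quoted above (a fixed increment with reward, and a decrement with nonreward that grows with the strength of the response, the two combining algebraically) can be illustrated with a toy calculation. The sketch below is not Spence's formal statement: the proportional form of the decrement, the function name `update_tendency`, and both parameter values are assumptions chosen only to make the quoted relations concrete.

```python
def update_tendency(strength, rewarded, increment=1.0, decrement_rate=0.2):
    """Return the excitatory tendency after one trial.

    Reward adds a fixed increment. Nonreward subtracts a decrement that is
    proportional to current strength (an assumed form), so strong tendencies
    lose more than weak ones.
    """
    if rewarded:
        return strength + increment
    return strength - decrement_rate * strength


# Hypothetical values: the same nonreward weakens a strong tendency more.
loss_strong = 10.0 - update_tendency(10.0, rewarded=False)
loss_weak = 2.0 - update_tendency(2.0, rewarded=False)
```
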


Amsel's (1958) analysis of discrimination 
learning specifies that r_R-s_R and r_F-s_F are
the mechanisms behind the positive and 
negative tendencies of Spence. His basic 
position is as follows: 


(a) that under certain conditions nonreward is an 
active factor which may be termed frustrative
nonreward; (b) that such frustrative events are
antecedents to a primary, aversive, motivational
condition, frustration; and (c) that a secondary
(learned) form of this primary aversive condition,
termed fractional anticipatory frustration
(r_F-s_F), develops through classical conditioning and
is the inhibitory mechanism in nonreward. The 
position will be taken that frustrative-nonreward 
events determine activating (drive) effects, which 
can be measured as an increase in the vigor of 
behavior which immediately follows the frustra- 
tive events, and are also responsible for inhibi- 
tory effects, which are at least partly responsible 
for decreases in strength of the instrumental be- 
havior which is terminated by the frustrative 
event [p. 103]. 


Later in the analysis, he specifically re- 
lates frustration to discrimination: 


the positive and negative goal events determine 
positive and negative excitatory tendencies. ... 
If we assume that on nonreinforced trials there is 
present, in the goal region, a frustration reaction, 
it is possible to identify the negative excitatory 
factor conditioned to S—: since the negative dis- 
criminandum is present immediately preceding 
the nonreinforcement, its trace becomes conditioned
to the frustration reaction (S- → r_F). This
anticipatory (antedating) frustration response
would then be evoked by S-, and the response-produced
stimulus (s_F) would become connected
to not responding [p. 111]. 


The demonstration that nonreward has 
frustrative effects is, of course, necessary 
before a theory based on such effects can 
be of much value. An experiment by Amsel 
and Roussel (1952) introduced the use of 
the double-runway apparatus to test for 
frustrative effects of nonreward. Using this 
apparatus, a number of experiments, re- 
cently reviewed (Amsel, 1962), have ac- 
cumulated preponderantly favorable evi- 
dence; and the even more recent experiments 
(McHose, 1963; McHose & Ludvigson,
1964; MacKinnon & Amsel, 1964) support 
this conclusion. The importance of non- 


reward as an active factor in discrimination 
learning has been suggested in several experimental
studies (e.g., Fitzwater, 1952;
Grice & Goldman, 1955; Grove & Eninger, 
1952; Shoemaker, 1953). 

If the analysis of the role of frustrative 
nonreward in discrimination learning has 
merit, it should be possible to show that 
FE, measured in Runway 2 of the tandem-runway
apparatus, bears certain relationships
to various stages of discrimination
learning measured in Runway 1. More
specifically, evidence of nonreward-related
frustration measured in Runway 2 should 
precede any evidence that Ss are running 
faster to S+ and slower to S— in Runway 
1; and, at some point following the ap- 
pearance of discriminative behavior in 
Runway 1, a reduction in the magnitude of 
FE, measured in Runway 2, should become 
apparent. This experiment was designed to 
investigate these two implications for dis- 
crimination learning of a theory of frustra- 
tive nonreward. 
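
As a rough illustration of the kind of comparison involved, the following Python sketch computes an FE index as the difference between mean Runway 2 running time after reward and after nonreward in the first goal box. The function name `fe_index`, the record layout, and the numbers are hypothetical; they are not data from this study, and the text does not specify how its own indexes were computed.

```python
from statistics import mean


def fe_index(trials):
    """Mean Runway 2 time after reward minus mean time after nonreward.

    `trials` is a list of (rewarded_in_goal_1, runway_2_time_sec) pairs.
    A positive value means faster running after nonreward, i.e., an FE.
    """
    after_reward = [t for rewarded, t in trials if rewarded]
    after_nonreward = [t for rewarded, t in trials if not rewarded]
    if not after_reward or not after_nonreward:
        raise ValueError("need both rewarded and nonrewarded trials")
    return mean(after_reward) - mean(after_nonreward)


# Hypothetical trials for one S (times in seconds).
example_trials = [(True, 2.0), (False, 1.5), (True, 2.2), (False, 1.7)]
example_fe = fe_index(example_trials)
```

Computing such an index separately for blocks of trials before and after Runway 1 discrimination appears would allow the predicted reduction in FE to be examined.
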

The first implication—that in a successive 
discrimination involving approach and 
avoidance changes to positive and negative 
stimuli, the onset of discrimination will be 
evident only after nonreward becomes frus- 
trating—is based on the assumption that 
only when nonreward becomes frustrating 
does S— elicit the avoidance responses 
necessary for discrimination learning. The 
second implication derives from the assump- 
tion that learning a discrimination involves 
the separate elicitation of r_R and r_F by
the positive and negative stimuli. If frustration
is dependent on r_R-s_R, FE should
diminish once a discrimination is formed
and S- no longer elicits strong r_R, but
elicits r_F. The experiment which follows
involves comparisons between indexes of
FE derived for the periods immediately
before and after clear-cut evidence of discrimination
learning.

A simple black-white discrimination was
utilized. These black and white stimuli
were presented successively in Runway 1
in a predetermined order and were associated
with reward or nonreward in G1, after
which Ss traversed a second runway to
receive reward in G2. Discrimination learning
was also studied in a single runway (Runway 1
and G1 of the double runway)
providing a parallel discrimination uncontaminated
by any possible effects of the
second runway.


Method 


Ss and Apparatus


Ss were 30 albino rats, 15 males and 15 females, 
approximately 120 days old. There were 20 Ss in 
the double-runway group and 10 in the single- 
runway control. 

The apparatus was the L-shaped double runway
(Amsel & Penick, 1962) with slight modifications.
In series were a start box, Runway 1, Goal 1 (G1),
Runway 2, and Goal 2 (G2). The start box was
approximately 12 inches long, runways were ap- 
proximately 60 inches long, and the goal boxes 
were approximately 23 inches long. All were 31$ 
inches wide and 4 inches high. 

The apparatus was painted white and inside 
walls were lined with 1¢-inch Plexiglas. The wall
color was changed by inserting painted sections
of sheet metal. The floor in the start box, Runway 1,
and G1 was black or white rubber matting
mounted on Masonite. The floor in Runway 2
and G2 consisted of }¢-inch brass bars spaced 14
inch apart. The top of the apparatus consisted of 
32-inch Plexiglas, hinged to allow easy access to 
the runways. The walls and floor in the first sec- 
tion of the apparatus were either black or white 
according to a predetermined order. The walls in 
Runway 2 and G2 were alternating black and
white vertical stripes. A small metal feeding tray, 
painted the color of the goal box, was suspended 
from the side wall of G1 approximately 2 inches
from the end. A black and white striped feeding 
tray, flush with the floor, held the reward in Goal 
Box 2. Illumination was provided by 40-watt 
fluorescent strips suspended approximately 22 
inches above the apparatus. 

There were six drop doors in the apparatus, 
three in each section. The start box had two doors: 
an orienting door of galvanized iron and a start 
door of clear Plexiglas which allowed access to 
Runway 1. There was a retrace door of galvanized 
iron between Runway 1 and G1. This door remained
open until the animal entered G1 and was
then closed. The same sequence of doors was used
in the second section of the apparatus, when G1
became the start box for Runway 2. 

Performance measures were taken by Standard
Electric timers activated by photoelectric systems.
The entire timing apparatus was covered
with soundproofing material to reduce audible
cues from the operation of the timing circuit. The
sequence of starting and stopping the clocks in
the first section of the apparatus was as follows.
(a) Dropping the Plexiglas start door in the start
box closed a microswitch which started Timer 1.
(b) When S moved to a point 3 inches into Runway 1,
Timer 1 stopped, providing a starting time
measure, and Timer 2 started. (c) Passage 54
inches down Runway 1 stopped Timer 2, providing
a measure of running time, and started Timer 3.
(d) Entrance midway into G1 stopped Timer 3,
providing a measure of goal time, and activated
a Hunter timer. Following a 25-second period, the
timer activated a solenoid which dropped the
metal orienting door in G1 and reset the equipment
to obtain the same sequence of time measurements
in the second section of the apparatus.
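
The relation between the photocell events and the three performance measures can be sketched as follows. This is a minimal illustration in Python, assuming each event is recorded as a time stamp in seconds from a common clock; the function name `runway_measures` and its arguments are hypothetical, since the original apparatus used separate timers rather than time stamps.

```python
def runway_measures(door_drop, at_3_in, at_54_in, mid_goal):
    """Derive starting, running, and goal times from four event times.

    door_drop: start door dropped (Timer 1 starts)
    at_3_in:   S reaches 3 inches into the runway (Timer 1 stops, Timer 2 starts)
    at_54_in:  S reaches 54 inches (Timer 2 stops, Timer 3 starts)
    mid_goal:  S is midway into the goal box (Timer 3 stops)
    """
    if not door_drop <= at_3_in <= at_54_in <= mid_goal:
        raise ValueError("event times must be in chronological order")
    return {
        "starting_time": at_3_in - door_drop,
        "running_time": at_54_in - at_3_in,
        "goal_time": mid_goal - at_54_in,
    }


# Hypothetical event times for one trial (seconds).
example_measures = runway_measures(0.0, 1.0, 3.5, 4.0)
```
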


Procedure 


Preliminary Training. Ss were placed on a dep- 
rivation schedule at least 15 days prior to the 
beginning of the experiment, and this schedule was 
continued throughout the experimental training. 
Ss were housed in individual cages and fed 9 
grams of laboratory chow approximately 15 minutes
after each day's handling or experimental
trials. One hour later the remaining food was removed.
The hunger drive was maintained at 22
hours of food deprivation. Water was available at
all times in the home cage. During this period, Ss
were handled daily in groups of 10 by E.

There followed a period of 5 days during which 
Ss were allowed to explore the alleys for 5 min- 
utes daily. During this period, the photoelectric 
system was operating to allow Ss to become ac- 
customed to the mild noises of the system. 

Experimental Period. Ss were run in groups of 10 with a minimum of 10
minutes between successive trials. Each S was given six trials per day
for 30 days with the stimulus cues of the first section of the apparatus
varied in a predetermined order. S was placed in the start box and the
orienting door was dropped after 3 seconds. Two seconds later the start
door was dropped, allowing S to traverse the runway to G1 to receive
about .1 gram of food in two pellets. S was confined in G1 for 25
seconds on each trial, during which time the pellets were consumed on
reward trials. At the end of this period the orienting door was dropped
automatically. Two seconds later the start door was dropped, allowing S
to traverse the runway to G2 and receive a single pellet of food (about
.05 gram). S was removed from G2 after 25 seconds and placed in a
carrying cage to wait for the next trial. The color of the walls and
floor of Runway 1 was varied in the following manner for all groups:
WBBWWB, WWBBWB, BWBWBW, BWWBBW, BBWWBW, WBWBWB. For half the Ss of each
group, black was positive and white negative; for the other half, the
stimulus cues were reversed.
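
The six daily color orders and the counterbalancing of the positive color can be written out as a small schedule. This is a sketch under a stated assumption: the text lists six orders for 30 days of training, and the code assumes they were simply cycled in the order given, which the source does not state. The function name is hypothetical.

```python
# The six daily orders of Runway 1 color, as listed in the text.
DAILY_ORDERS = ["WBBWWB", "WWBBWB", "BWBWBW", "BWWBBW", "BBWWBW", "WBWBWB"]


def day_schedule(day, positive_color):
    """Return [(color, rewarded_in_G1), ...] for the six trials of a 1-based day.

    ASSUMPTION: the six orders repeat cyclically across the 30 days.
    """
    order = DAILY_ORDERS[(day - 1) % len(DAILY_ORDERS)]
    return [(color, color == positive_color) for color in order]


# Every listed order contains three white and three black trials, so each
# day gives three rewarded and three nonrewarded trials in G1.
balanced = all(o.count("W") == 3 and o.count("B") == 3 for o in DAILY_ORDERS)

# Day 1 for an S with black positive.
day1_black_positive = day_schedule(1, "B")
```

Because each order is balanced, the reward schedule in G1 is the same 50% for both halves of each group; only the color carrying the reward differs.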


Results and Discussion 


All 30 Ss learned the discrimination prob- 
lem as indicated by a consistent separation 
of the Runway 1 performance measures to 
the positive and negative cues. These individual
data are not shown; however,
Figures 4 and 5 show the averaged results 
graphically. Figure 4 presents the discrimi- 
nation and FE results for the double-run- 
way group. The discrimination curves 
(bottom panel) indicate the development 
of differential responding in Runway 1 to 
the discriminanda for the running-time 
measure. The top panel shows the parallel 
development of FE in Runway 2 for the 
same measure. Other measures (goal-entry 
time and starting time in both runways) 
show essentially the same relationships (see 
Ward, 1962). 

At the start of training, the predominant 
response in Runway 1 was approach to the 
goal. Both curves show a decrease in response 
time over the first 6 days (36 trials). At 
this stage, where performance is strong to the 
as-yet-undifferentiated stimuli, the top 
panel of Figure 4 indicates a consistent FE, i.e., performance in
Runway 2 is more vigorous following nonreward than following reward.
Now there appears, in the lower panel, the separation characteristic of
discrimination learning: a gradual further decrease in the response
times to the positive stimulus and a pronounced increase to the negative
stimulus. Following the stage of strongest discrimination in Runway 1,
there is evidence of disappearance of FE, at least temporarily. These
relationships have now been demonstrated in several experiments (see
Experiment 2, for example).

Fig. 4. The relation of magnitude of FE (top panel) to stages of the
discrimination (bottom panel) in the double-runway apparatus.

Fig. 5. The discrimination (compare with bottom panel of Figure 4) in
the single-runway control condition.

The fact that all Ss learned the discrimination indicates that
continuous reward in G2 of Runway 2 performance does not mask the effect
of discrimination in Runway 1. Nonetheless, Figure 5 presents a clear
picture of discrimination learning in Runway 1 alone without these
contaminating effects of Runway 2. By comparing these curves with those
of Figure 4 (bottom panel), the effect of the use of the second runway
on discrimination can be evaluated. It seems apparent that (a) the
presence of the second runway does attenuate the discrimination
difference somewhat and (b) the return toward the base line of the
negative curve occurs in the single- as well as in the double-runway
condition, although it occurs somewhat earlier in the latter case.

In order to test our hypotheses concerning
the magnitude of FE in the pre- and post- 
discrimination phases, two indexes of frus- 
tration were computed for each S of the 
double-runway group from Runway 2 meas- 
ures: one for a period of 5 days preceding 
discrimination, the other for 5 days follow- 
ing. 

Taking the point of discrimination to be 
Day 13, the indexes of FE for each S of 
the double-runway group were based on 5- 
day periods: Days 8-12 and Days 14-18. 
Over these 5-day periods, the median run- 
ning time following nonreward was sub- 
tracted from the median running time 
following reward for each S. These differ- 
ences yielded positive or negative FE indexes 
for the two 5-day periods. The before and 
after FE scores are shown in Table 1 for 
running time, and the mean FEs are 9.45 
and 2.65, respectively. The before period 
shows 4 reversals of FE out of 20; the after 
period shows 8. In order to test our hypotheses about presence and
absence of FE before and immediately following discrimination, these two
distributions of scores should be tested against a hypothetical
distribution with a mean FE index of zero. Such a procedure yields
t = 2.46 for FE before discrimination, which is significant between the
.01 and .05 levels. The FE after discrimination is not statistically
significant (t = .84).


TABLE 1
INDEXES OF FE BASED ON RUNNING TIME IN RUNWAY 2

                Discrimination
  S        Before        After
  1          14           25
  2           5          −30
  3          12           19
  4          21           10
  5          25        [illegible]
  6          45           29
  7          22           18
  8           9            2
  9          12          −10
 10          21            0
 11           6        [illegible]
 12          −2            6
 13          14           13
 14          10            2
 15           8        [illegible]
 16          −4            0
 17           3        [illegible]
 18         −34          −16
 19          −7            3
 20           9            8
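
The FE index and its test against a hypothetical mean of zero can be sketched as follows. The helper names are hypothetical, and the test shown is the ordinary one-sample Student's t; the source does not say how the error term was computed, so the value this sketch returns need not coincide with the published t. The scores are the "Before" column of Table 1 as printed.

```python
import math
from statistics import mean, median, stdev


def fe_index(times_after_reward, times_after_nonreward):
    """Median Runway 2 time after reward minus median time after nonreward.

    A positive index means faster running after nonreward, i.e., an FE.
    """
    return median(times_after_reward) - median(times_after_nonreward)


def one_sample_t(scores, mu=0.0):
    """Student's t for the mean of `scores` against a hypothetical mean `mu`."""
    n = len(scores)
    return (mean(scores) - mu) / (stdev(scores) / math.sqrt(n))


# "Before" column of Table 1 (Days 8-12), one index per S.
before = [14, 5, 12, 21, 25, 45, 22, 9, 12, 21,
          6, -2, 14, 10, 8, -4, 3, -34, -7, 9]

mean_before = mean(before)                           # mean FE index
reversals_before = sum(1 for x in before if x < 0)   # negative indexes
t_before = one_sample_t(before)                      # df = len(before) - 1
```

The mean and the count of reversals computed this way can be checked directly against the values given in the text for the before period.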

Experiment 1 provided an important test of frustrative nonreward theory
as it applies to simple discrimination learning, showing the
relationship of FE measured in Runway 2 to discrimination learning
developing in Runway 1. It demonstrated that, in a double runway,
discrimination learning in Runway 1 could clearly be identified and was
not masked by the effect of continuous reward in Runway 2. In this
sense, the experiment increased our confidence in the relative
independence of the phenomena in Runways 1 and 2, suggesting that what
could be shown in a single runway would also manifest itself in the
first runway of a double-runway apparatus, provided the stimulus
properties of the two segments were sufficiently different.


EXPERIMENT 2: DISCRIMINATION LEARNING 
AS A FUNCTION OF THE NUMBER OF 
PREDISCRIMINATION REWARDS 


One of the implications of frustration theory is that, with increasing
strength of r_R, there should be greater immediate effects of nonreward
(FE) and, consequently, greater conditioned (“inhibiting”) effects of
nonreward (r_F). As we indicated earlier, experiments on FE are now
numerous and go well beyond the simple demonstration of the effect.
Although this is not the place to review such studies in detail, or
exhaustively, it seems clear that nonreward produces a relatively
immediate FE following a prolonged period of continuous reward (Amsel &
Roussel, 1952; Wagner, 1959); that FE develops gradually when the Runway
1 response is partially rewarded from the outset in G1 (Amsel & Hancock,
1957; Wagner, 1959); that continuous nonreward in G1 after FE has
already been established will result in the disappearance of the effect
(McHose, 1963); and that FE is greater when conditions for eliciting r_R
are, by definition, better (Amsel, Ernhart, & Galbrecht, 1961; Amsel &
Hancock, 1957).

Such earlier experiments on FE and the results of the first experiment
of this report suggest that, if the initial strengths of the positive
tendencies to two discriminanda are equal, rate of discrimination
learning should be a positive function of the strength of r_R to the
negative stimulus. That is, the stronger the evocation of anticipatory
reward by the negative stimulus (S−) in a discrimination at the outset,
the faster should discrimination be, because stronger elicitation of r_R
leads to greater frustration when nonreward occurs and leads to faster
conditioning of r_F and, therefore, to faster discrimination.

There is already in the literature an 
experiment which shows that (a) nonrewards 
seem more important than rewards in the
formation of a discrimination and (b) the 
effectiveness of a nonreward is directly 
related to number of prior rewards. This 
experiment by Shoemaker (1953), which 
seems particularly relevant as an introduc- 
tion to Experiment 2 and the subsequent 
experiments on prediscrimination effects, 
investigated the relative effectiveness of 
rewarded and nonrewarded trials in a black- 
white discrimination. Groups of rats were 
first subjected to three levels of prediscrimi- 
nation training in which they received 4, 12, 
or 24 rewarded runs to both of the discriminanda.
There followed a second phase of the
experiment in which Ss experienced varying 
numbers of rewards and nonrewards to the 
black and white stimuli (separately). In 
the final phase of the experiment they were 
“choice” tested with both alleys (black 
and white) available. The indication was 
that final discrimination performance was 
better the larger the number of rewarded 
or nonrewarded trials in the second phase. 
However, the effect of a given number of 
nonrewarded trials on later discrimination 
was greater than the effect of that number 
of rewarded trials. Also, the effect of non- 
rewards on later discrimination performance 
was directly related to the number of pre- 
liminary rewarded runs (4, 12, or 24) to 
both of the eventual discriminanda. 

In Experiment 2 the relationship between nonreward (frustrative)
effects, observed independently in Runway 2 in the form of FE, and
discrimination learning, indicated by changes in running speed to S1+
and S2− in Runway 1, was examined following varying numbers of
continuous rewards to both stimuli (S1+S2+) in a preliminary phase of
the experiment. The hypothesis was: the larger the number of
prediscrimination rewards, the earlier the appearance of FE in Runway 2
and of differential responding to S1 and S2 in Runway 1 when S2 is made
negative (signals nonreward).


Method 


Ss and Apparatus 


Ss were 40 albino rats approximately 120 days old. They were randomly
assigned to three groups. Twenty Ss were assigned to Group 0 and 10 to
each of Groups 12 and 48.² Half the Ss in each group were male and half
female.

The double-runway apparatus described in Experiment 1 was employed.


Procedure 


The general procedure during preliminary training and the
experimental period was exactly as described in Experiment 1.

Experimental Manipulation. The hypothesis was investigated by the use of
three groups in which the number of prediscrimination rewards to the two
discriminanda (black and white) was varied. In Group 0 the
discrimination problem was introduced at the start of training. Group 12
Ss received 12 prediscrimination rewards in G1: 6 when Runway 1 was
black (B+) and 6 when it was white (W+). Group 48 received 24 rewards in
relation to each color of Runway 1. In every case, Ss also ran in Runway
2 following reward in G1.

The discrimination training was exactly as in Experiment 1. It involved
running to black and to white stimuli in Runway 1 which were varied in a
predetermined order, as in Experiment 1, and rewarded or nonrewarded in
terms of whether they were positive or negative. After reward or
nonreward in G1, Ss traversed Runway 2 to receive reward in G2.
Discrimination learning was evidenced by the separation of performance
measures in Runway 1, i.e., by approach (shorter running time) to the
positive stimulus and avoidance (longer running time) to the negative
stimulus. The performance measures were running and goal times in
Runways 1 and 2. Again, Runway 1 measures permitted the observation of
discrimination learning, and Runway 2 measures permitted evaluation of
FE in relation to this discrimination.


² The 20 Ss of Group 0 are the same as those of the double-runway
group of Experiment 1.


Results and Discussion


The hypothesis that the three groups 
learned the discrimination at different rates 
as a result of varying numbers of predis- 
crimination rewards was tested by analysis 
of variance. Two measures of performance, 
running and goal times, were used in de- 
termining the onset of discrimination learn- 
ing for individual Ss of the three groups. 
Two criteria of discrimination, an “easy” 
and a “hard,” were defined as follows. 
Easy—the first five out of seven separations 
of the reward and nonreward daily medians 
in Runway 1. The two nonseparations could 
be either ties or reversals. Hard—the first 
nine separations of the reward and nonre- 
ward daily medians with no reversals. 
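
The two criteria can be made concrete with a short sketch. Several readings are assumptions, since the text leaves them open: a day counts as a separation when the median to the negative stimulus exceeds the median to the positive stimulus; the easy criterion is dated to the first day of the earliest seven-day window containing at least five separations; and the hard criterion is dated to the first day of the earliest run of nine consecutive separations, with a tie treated as ending a run. All function names are hypothetical.

```python
def separations(pos_medians, neg_medians):
    """Mark each day +1 (separation), 0 (tie), or -1 (reversal).

    A separation means the daily median time to the negative stimulus
    is longer than the daily median time to the positive stimulus.
    """
    marks = []
    for pos, neg in zip(pos_medians, neg_medians):
        marks.append(1 if neg > pos else (0 if neg == pos else -1))
    return marks


def easy_criterion_day(marks):
    """1-based first day of the earliest 7-day window with >= 5 separations."""
    for i in range(len(marks) - 6):
        if sum(1 for m in marks[i:i + 7] if m == 1) >= 5:
            return i + 1
    return None  # criterion never met


def hard_criterion_day(marks):
    """1-based first day of the earliest run of 9 consecutive separations.

    ASSUMPTION: a tie, like a reversal, ends the current run.
    """
    run = 0
    for i, m in enumerate(marks):
        run = run + 1 if m == 1 else 0
        if run == 9:
            return i - 8 + 1
    return None


# Hypothetical record for one S, not data from the study.
example = [-1, 0, 1, 1, -1, 1, 1, 1, 0] + [1] * 9
easy_day = easy_criterion_day(example)
hard_day = hard_criterion_day(example)
```

Under these readings the hard criterion can never be met earlier than the easy one, since any nine consecutive separations contain a seven-day window with at least five.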

The running-time data for onset of dis- 
crimination for the easy and hard criteria 
are presented in Table 2. The goal-entry 
data for the easy and hard criteria are 
presented in Table 3. 

Analyses of variance indicate that the discrimination problem was
learned faster by the groups given more prediscrimination rewarded
trials to the stimuli to be used as the discriminanda. All the analyses
yielded significant Fs with the exception of the one based on the
running-time measure and the easy criterion. This F was 3.22,
approaching significance at the .05 level of confidence (required
F = 3.25, df = 2, p < .05). The goal-entry difference with the easy
criterion was significant beyond the .05 level (F = 3.64). The
differences in the case of both measures and the hard criterion were
significant beyond the .01 level (Fs = 5.26 and 6.14 for running and
goal times, respectively).


TABLE 2
NUMBER OF DAYS TO THE EASY (E) AND HARD (H) CRITERIA OF DISCRIMINATION
FOR THE RUNNING-TIME MEASURE

                                     Group
                 0                     12                      48
  S        E           H          E           H          E           H
  1   [illegible] [illegible]     9          13     [illegible] [illegible]
  2   [illegible] [illegible] [illegible] [illegible] [illegible] [illegible]
  3   [illegible] [illegible] [illegible] [illegible] [illegible] [illegible]
  4       12          12      [illegible] [illegible] [illegible] [illegible]
  5        4      [illegible]     2      [illegible] [illegible] [illegible]
  6        8      [illegible]    12          15          9           9
  7   [illegible] [illegible] [illegible] [illegible] [illegible] [illegible]
  8        8          14         10          15          5           8
  9   [illegible] [illegible] [illegible] [illegible] [illegible] [illegible]
 10        9          13      [illegible] [illegible]     3          14
 11   [illegible] [illegible]
 12        5          15
 13   [illegible] [illegible]
 14        1          19
 15   [illegible] [illegible]
 16       12          12
 17   [illegible] [illegible]
 18   [illegible] [illegible]
 19        7      [illegible]
 20   [illegible] [illegible]
  M       8.0        12.8        6.5        10.3        4.5         8.6


TABLE 3
NUMBER OF DAYS TO THE EASY (E) AND HARD (H) CRITERIA OF DISCRIMINATION
FOR THE GOAL-ENTRY MEASURE

                                     Group
                 0                     12                      48
  S        E           H          E           H          E           H
  1        8          10          1           3          9           9
  2        9           9          9           9          1          10
  3       10          12          7          12          1           1
  4       13          15          7          12          8           8
  5       13          16          3           9          3           6
  6       17           9          2           5          7           7
  7       14          21          4           6          3           3
  8        8          18          1          12          5           7
  9       11          13          4           8          1           7
 10   [illegible] [illegible] [illegible] [illegible] [illegible] [illegible]
 11        8           8
 12   [illegible] [illegible]
 13        4           7
 14        2          10
 15       10          10
 16        6           8
 17        1           9
 18        9           9
 19        2           8
 20        1           4
  M       7.2        10.9        4.3         8.3        4.1         6.1
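
For readers who wish to see how an F ratio of the kind reported above is formed, the following is a minimal sketch of a one-way analysis of variance for independent groups of unequal size. It is the textbook computation, not necessarily the exact analysis the authors ran; the function name is hypothetical and the scores are invented toy values, not days-to-criterion data from Tables 2 and 3.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: squared deviations from own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Invented toy scores for three groups.
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_b, df_w = one_way_f(toy)
```

With three groups of 20, 10, and 10 Ss, the same formulas give 2 degrees of freedom between groups and 40 − 3 = 37 within groups.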

The results of an analysis of the Runway 2 measures are not presented in
detail because they show essentially the same picture as in Experiment 1
(see Ward, 1962). In Groups 12 and 48 (Group 0 is the double-runway
condition of Experiment 1) the findings are similar to those presented
in Table 1: there is a clear and significant FE in the Runway 2 curves
preceding discrimination and a marked decline to nonsignificance in FE
following discrimination. In the case of Group 48, FE appears
immediately as Ss are switched to discrimination; in Group 12, FE
appears on the fourth discrimination day.

The results of the present analysis show 
a direct relationship between number of 
prediscrimination rewards to the two dis- 
criminanda and rate of discrimination 
learning. In this regard they are in accord 
with the earlier result of Shoemaker (1953), 
who found improvement of discrimination 
performance as number of prediscrimination 
rewards increased in a single-runway
apparatus.

Earlier still, Spence (1937) had shown
that discrimination learning was dependent 
upon the excitatory strengths of S+ and 
S— in a situation where the excitatory 
strengths were determined by the number of 
rewards and nonrewards received in earlier 
discrimination problems. The present ex- 
periment investigated a refinement of this 
in that excitatory strength, as defined by 
the number of prediscrimination rewards, 
was the same to the two stimuli but varied 
for the different groups. The effects of 
nonreinforcement were greater (as indicated 
by faster discrimination learning) in groups
receiving larger numbers of prediscrimina- 
tion rewarded trials. 

Spence, Goodrich, and Ross (1959), in a more recent study, found that
the decremental effect of nonreinforcement was much greater in a
high-drive group (40-hour food deprivation) than in a low-drive group
(3-hour food deprivation). They noted that the strength of the response
to the negative stimulus increased as fast as that to the positive
stimulus during the early trials. The assumption was made that the
discrimination occurs when interfering responses are cued to internal
stimuli (s_F) which are aroused by anticipatory frustration reactions
(r_F). These frustration responses would not develop until the animal
first learned to expect food. The frustration effect might be a function
of the drive level and strength of the expectation. Their finding of a
greater decrement due to nonreinforcement in the high-drive than in the
low-drive group would tend to support this assumption.


Predictions from the Theory of
Prediscrimination Effects


In Experiment 2, Ss were exposed before a discrimination to both of the
discriminanda, and approach responses in the presence of both were
always rewarded. We have gone on to consider more complicated cases in
which discrimination learning is preceded, not only by continuous-reward
experience, but also by partial-reward experience in relation to one or
to both of a pair of discriminanda (Amsel, 1962). Experiments 3, 4, and
5 deal with this sort of problem and test predictions, developed out of
the frustration analysis, about the course of a black-white
discrimination depending on some history of continuous- or of
partial-reward experience in relation to one or both of the eventual
discriminanda. Again, discrimination is conceptualized in terms of
approach and avoidance and is measured in terms of amplitude (running
time) changes to the positive and negative discriminanda.

Table 4 presents several possible kinds of prediscrimination experience
with discriminanda in relation to a B+W− discrimination. Those shown
above the dashed line were the subject of an earlier theoretical
analysis (Amsel, 1962) and are investigated in Experiments 3 and 4;
those below the dashed line were not considered in the earlier analysis,
and their specific implications will be discussed in connection with
Experiment 5.


TABLE 4
KINDS OF PREDISCRIMINATION EXPERIENCE IN RELATION TO A
BLACK (+)-WHITE (−) DISCRIMINATION

One prediscrimination stimulus
  B+      Continuous reward, positive stimulus
  W+      Continuous reward, negative stimulus
  B±      Partial reward, positive stimulus
  W±      Partial reward, negative stimulus

Two prediscrimination stimuli
  B+W+    Continuous reward, both stimuli
  B±W±    Partial reward, both stimuli
  - - - - - - - - - - - - - - - - - - - - - - - - - - -
  B−W+    Discrimination reversal
  B±W+    Partial reward, positive stimulus;
          continuous reward, negative stimulus
  B+W±    Partial reward, negative stimulus;
          continuous reward, positive stimulus


Fig. 6. Schema showing prediscrimination-discrimination sequences when
the prediscrimination treatment is governed by three attributes, and
exposure is to one of the eventual discriminanda.

Figure 6 is a schema containing three factors presumed to be important
in this research. It depicts, in terms of both a specific (B, W) and a
more general (S1, S2) notation, a class of situations in which only one
of the two eventual discriminanda is presented prior to discrimination
learning: (a) the prediscrimination stimulus may become either the
positive or the negative stimulus in the eventual discrimination; (b)
the response to this prediscrimination stimulus may be partially or
continuously rewarded; (c) the number of prediscrimination trials may be
large, small, or of some intermediate value. Figure 6 shows the various
possible combinations of a, b, and c. The 12 cells of this schema
represent a matrix of possible experimental conditions, each
counterbalanced for absolute effects of color, for the case of a
black-white discrimination. In terms of a behavior theory with
frustrative nonreward premises, how should these prediscrimination
exposures to a single discriminandum (stimulus) affect a subsequent
black-white discrimination?

³ The theoretical analysis presented in this section follows
substantially the one developed in the earlier report.


Partial Prediscrimination Reward in Relation 
to One Stimulus 


Our analysis of the effects on discrimination of prior partial reward of
responses to one of the discriminanda involves the conception that
anticipatory frustration-produced stimuli (s_F) become associated with
approach responses in partial-reward training, which means that, at some
stage in partial-reward training, s_F will become part of a stimulus
complex evoking approach and earlier evidence of conflict will
disappear.

Of the four stages in partial-reward acquisition outlined earlier, three
are theoretically differentiable with respect to the involvement of
r_F-s_F: Stage 1, when r_R has been conditioned, but not yet r_F; Stage
3, when both r_R and r_F have been conditioned and their stimuli, s_R
and s_F, evoke competing (approach and avoidance) tendencies; and Stage
4, when both r_R and r_F have been conditioned and their stimuli, s_R
and s_F, both evoke approach tendencies. We have guessed that these
stages correspond to 10, 32, and 120 trials, respectively, in our
current experiments.

An assumption from the stimulus-response analysis of conflict and
displacement (Miller, 1944, 1948) for which there is strong experimental
support (see Miller, 1959) is that the generalization gradient for
positive (approach) tendencies is flatter than that for negative
(avoidance) tendencies. If we combine this assumption with the
differentiated stages as they involve r_R and r_F, and identify
s_R → app (which results from r_R) and s_F → av (which results from r_F)
as the positive and negative tendencies in Miller's type of analysis,
the three panels of Figure 7 represent, from top to bottom,
respectively, the state of affairs at the beginning of a black-white
discrimination following a hypothetical 10, 32, or 120 prediscrimination
partial rewards of a response to the preexposed stimulus, S1.
(Throughout this section, when we refer to the prediscrimination
exposure of a single stimulus in relation to an S1-S2 discrimination, S1
will be the preexposed stimulus, and S2 will be the new stimulus in the
discrimination.) Represented on the left-hand side of Figure 7 are the
relative magnitudes of approach and avoidance tendencies to the
prediscrimination stimulus (S1) through the mediating response-produced
stimuli, s_R and s_F, at the start of the discrimination. Shown on the
right are the generalized strengths of these tendencies to the stimulus
(S2) which is new in the discrimination.

Fig. 7. Strength of response tendencies to S1 and S2 at the beginning of
discrimination training following prediscrimination training in which
approach to S1 has been partially rewarded (S1±). (Axis label:
DISCRIMINATION STIMULUS.)

A simplifying assumption, represented in the bottom panel of Figure 7,
is that the strength of approach elicited through r_F → s_F by S1 and S2
in the late stage is directly related to the strength of avoidance
tendencies elicited through r_F → s_F in the middle stage. The
suggestion is that the avoidance tendency to s_F is, at least partly,
offset by an approach tendency at this stage.

We are now in a position to make some predictions about the conditions
in Figure 6. These predictions will depend crucially on the depth
dimension: whether prediscrimination training was extended to 10, 32, or
120 trials. Remember that in our example, the prediscrimination stimulus
is always S1± and the discrimination is either S1+S2− or S1−S2+.
Outlined for each case of prediscrimination partial reward are
predictions as to which discrimination will be learned faster: that in
which the prediscrimination stimulus remains positive in the
discrimination (S1± → S1+S2−) or that in which it becomes negative
(S1± → S1−S2+).


In the case of 120 prediscrimination partial rewards, the reasoning (see
Figure 7) is that, at the start of discrimination, S1 will elicit r_R
and r_F strongly and about equally. On the other hand, S2, which is new
in the discrimination, will elicit r_R relatively strongly because of
the flat positive gradient, but will elicit r_F very weakly because of
the steep negative gradient. Where S1 becomes negative in the
discrimination after a great many prediscrimination trials, S1− will
elicit strong r_F → s_F, and s_F will be associated with continued
approach. This will slow discrimination relative to the condition where
S1 remains positive in the discrimination. Here the S2 stimulus, new in
the discrimination, elicits very weak r_F. Consequently, there is little
s_F conditioned to approach, there is little to counteract the build-up
of avoidance to S2−, and discrimination should develop more quickly. The
theory predicts the following: after a very large number of
prediscrimination partial rewards of an approach response to S1, S1+S2−
will produce faster discrimination than will S1−S2+. This prediction and
its relations to other predictions are shown in Figure 8.


The most interesting and critical comparison to be made for purposes of
our analysis is between the 120-prediscrimination-trial case and the
intermediate, 32-prediscrimination-trial case. In the former case, s_F
is already connected to approach, while, in the latter, s_F still
elicits avoidance. Consequently the prediction is reversed from one case
to the other. The reasoning for the 32-trial case is that both r_R and
r_F are being evoked by S1 at the start of discrimination, but s_R
elicits approach and s_F still elicits avoidance. When, after 32 trials,
a switch is made from prediscrimination S1± to an S1+S2− discrimination,
S1+ evokes both approach and avoidance tendencies through s_R and s_F,
and S2−, new in the discrimination, evokes mainly generalized approach
(r_R) and little avoidance (r_F). When the shift is from S1± to S1−S2+,
there is a strong tendency for s_F to elicit avoidance in relation to
the now negative S1− (generalizing very little to S2+) and a strong
tendency for s_R to elicit approach to the new, and now positive, S2+.
This latter situation is much more favorable for discrimination. The
prediction: after an intermediate number of partial rewards of an
approach response to S1, S1+S2− will produce slower discrimination than
will S1−S2+.

Fig. 8. Predicted relative rate of discrimination learning following
various kinds of prediscrimination exposure to one or both of the
eventual discriminanda.

After few enough S1± trials to produce some r_R, but little or no FE and
no r_F, there should be no difference in the rate of the S1+S2− and
S1−S2+ discriminations.

Continuous Prediscrimination Reward in Relation
to One Stimulus


We deal here with the right side of Figure 6: continuously rewarded
prediscrimination experience with S1 followed by a discrimination in
which S1 remains positive or becomes negative. The predictions here are
most interesting in relation to those of the preceding section. Briefly,
with increasing numbers of rewards of the approach response to the
prediscrimination stimulus, S1+ should evoke increasing r_R, and
nonreward in the subsequent discrimination should be increasingly
frustrating. Since r_R generalizes strongly from S1 to S2, both should
evoke r_R quite strongly in the discrimination, FE being somewhat
greater when S1 becomes the negative stimulus in the discrimination than
when it stays positive. However, in either case the flat gradient of
generalization of positive tendencies leads to a prediction which
corresponds to the findings of our Experiment 2 and to the Shoemaker
results. In both cases, increasing the number of continuous
prediscrimination rewards, within the limits of the values employed,
produced faster discrimination.

The difference between the two curves, shown in the top panel of Figure
8, is consistently in favor of faster discrimination for S1+ → S1+S2−,
in contrast to the middle panel. This is not really a prediction, but is
suggested by some unpublished data. The details are given in the earlier
treatment (Amsel, 1962) and will not be included here since they have no
bearing on the present development.


Partial Prediscrimination Reward in Relation 
to Both Stimuli 


The bottom panel of Figure 8 summarizes predictions about the rate of
discrimination learning following prediscrimination (partial- or
continuous-reward) experience with both discriminanda presented
separately. It is clear that S1+S2+ → S1+S2− operates essentially like
S1+ → S1+S2− (except that it is not susceptible to generalization
effects, since neither B nor W is new in the discrimination) and is even
more specifically the case of our Experiment 2 and the Shoemaker
experiment.

The prediction concerning S1±S2± as a prediscrimination condition
indicates a decreasing monotonic relationship between rate of
discrimination and number of such prediscrimination trials. The
reasoning is simply that partial reward of approach responses to both S1
and S2 for 32 trials will result in conflicting approach and avoidance
response tendencies to both, while partial reward of this sort for 120
trials will result in approach responses to s_F in connection with both.
While conflict to S1 and S2 represents a difficult base on which to
build discrimination, persistence in responding to S1 and S2 (PRE in
relation to both discriminanda) is an even less readily reversible
condition. Hence the prediction depicted in the bottom panel of Figure
8: that the effect of increasing numbers of prediscrimination trials
will be reversed from S1±S2± to S1+S2+.


EXPERIMENT 3: EFFECTS OF NUMBER AND
KINDS OF PREDISCRIMINATION EXPERIENCES
ON DISCRIMINATION LEARNING


The discrimination procedure in Experi- 
ment 3 involves single presentation of 
stimuli (black or white) in Runway 1 with,
say, black negative (B—) and white positive 
(W+). Table 5 presents a summary of the 
experimental conditions and numbers of 
Ss in each condition. It also provides the 
notation introduced in Figure 6 and em- 
ployed in describing this and the remaining 
experiments and reveals two major sets 
of comparisons made in this experiment in 
the seven experimental conditions (not con- 
sidering counterbalancing for color in the 
horizontal comparison). 

The horizontal comparison in Table 5 tests the prediction that, after
relatively few (in this case 32) prediscrimination exposures to S1±, the
S1+S2− discrimination will require more trials than the S1−S2+
discrimination but that after many (120) prediscrimination exposures to
S1±, the opposite will be the case, and S1+S2− will be learned faster
than S1−S2+.

The vertical comparison in Table 5 involves only conditions in which
there have been a large number (120) of prediscrimination exposures to
one or both of the discriminanda. These comparisons test predictions
concerning kind of prediscrimination experience rather than number of
such experiences: that partial-reward exposure to both discriminanda
(S1±S2±) will produce greatest retardation of eventual discrimination
(either S1+S2− or S1−S2+), while continuous-reward exposure to only
one, which then becomes the positive stimulus (S1+ → S1+S2−), will yield
fastest discrimination (see top panel of Figure 8 and relevant text on
p. 19). The predicted order of resistance to discrimination (extinction)
of the five conditions of the vertical comparison in Table 5 is 7, 4, 3,
5, 6.


Method 
Ss and Apparatus 


Ss were 79 male hooded rats, approximately 
100 days old at the beginning of the experiment. 
The apparatus was the same as in the first two 
experiments.


TABLE 5 
SUMMARY OF EXPERIMENTAL CONDITIONS
N 32 prediscrimination trials N 120 prediscrimination trials 
8 W+ > W+B- 8 W+ ¬ W+B- Е, 
8 B+ > B+W- Sit > Si+S.— [1] 8 Bi В+ Sit ¬ Si 8s [3] 
8 WŁ > W—B+ 8 W4 > W-B+ 
RR ES B-W4- Sit > S:-S.+ [2] 9 Be há B-W+ Sit > Sı—S:+ И] 
4 VE WEBS §.458,4+5.— [5] 
4 W+ > W-B+ ^84 > 8-S4r [6] 


7 Bt — W+B— SiS 808: (0


Procedure


The experiment was in three stages: adjustment,
prediscrimination training, and
discrimination training.

Adjustment. At least 15 days prior to the first
prediscrimination trial, Ss were placed on a
food-deprivation schedule and handled daily. They
were maintained on 10 grams of laboratory chow
throughout the experiment. On Days 15-17, Ss
were allowed to explore the apparatus for 5 minutes
daily with all doors and the photoelectric
system operating to adjust Ss to the mild noises,
etc.

Prediscrimination. Ss were run 4 trials a day
to a total of either 32 or 120 trials under one of
the three conditions indicated earlier, i.e., S1+,
S1±, or S1±S2±. The intertrial interval was at
least 15 minutes. In the S1+ and S1± conditions
approximately half the Ss were run to white, the
other half to black. The sequence of rewards (+)
and nonrewards (−) for the S1± groups was
t-t-, -+-+ -—++-, +--+, = 
++ —-—, for each cycle of 6 days. The 6-day se- 
quence of colors and reward-nonreward for the 
5185: condition was B—-W-B--W-r, W+B+- 
W—B-, W+B—B+W-, B-—W+W-B+, B— - 
B+W+W-—, W--W—B- B4. 

Discrimination. The procedure here was the 
same as in prediscrimination, except that during 
this period of 36 days (4 trials/day) the sequence 
of rewards and nonrewards was +—+—, к= 
++, +--+, ——++, ++——, with the 
appropriate stimulus (B or W) always paired with 
reward or nonreward. 

On all trials, S was placed in the start box and
a metal orienting door was dropped (opened)
after 3 seconds. Two seconds later, the plastic
start door opened, allowing S to traverse Runway
1 to G1 where, on reward trials, three pellets
totaling about .1 gram were available. S was confined
in G1 for 25 seconds on all trials, after which the
opaque orienting door was automatically opened,
followed in 2 seconds by the opening of the other
(transparent) plastic start door. S now ran into
Runway 2 and to G2, where it found a further .1
gram of food and was confined for 25 seconds
before removal to the carrying cage to await the
next trial.


Results and Discussion 


In reporting the results of this experiment,
the following order of presentation has
been adopted. We present, first, results of
the horizontal comparison in Table 5, which
evaluates the effect of number of prediscrimination
exposures on discrimination,
and, second, results of the vertical comparison,
which deals with kind of prediscrimination
exposure. These findings will be presented
graphically in terms of the general
notation, which means, for example, combining
the data of the B± → B+W−
condition with those of the W± → W+B−
condition, since these are the same except
for any absolute effects of color.

A more detailed analysis of all aspects 
of the experiment will then be offered. This 
will relate the discrimination (Runway 1) 
test results to frustration effects in the 
test phase, measured in Runway 2, and 
also to prediscrimination performance in 
Runways 1 and 2. It will thus be possible 
to evaluate various aspects of our ap- 
proach to discrimination learning and 
partial reward phenomena and their rela- 
tionships to the effects of frustrative non- 
reward. This more intensive analysis will 
be based on only a portion of the data,
those from the two B± prediscrimination
conditions and the B±W± condition,
which were conducted at the same time
and showed the effects more clearly than
did the W± conditions.

Analysis of Discrimination Data in Terms
of the General Notation. Figures 9 and 10
show discrimination learning in terms of
extinction to the negative stimulus (S−),
which is either the same as the prediscrimination
stimulus (S1−) or is new in the
discrimination (S2−). In order not to complicate
the presentation, performance to the
positive stimulus (S1+ or S2+) is not shown.

When nonreward is introduced in G1
of the double runway, as it is on S− trials
of discrimination, performance in Runway 1
weakens (extinguishes), but then strengthens
again because of the terminal reward in G2
(see Experiment 1). This phenomenon,
which is also discernible in single runways,
seems to be accentuated by the double-runway
situation. There are, therefore, two
indications of extinction rate to S− in the
discrimination: (a) the cycle (or block of trials)
on which the extinction curve reaches its
highest point (longest time), and (b) the height
(time to run) of the curve at its highest point.
In the discrimination period, each 6-day
cycle represents a 24-trial block.

Figure 9 shows the effect of number of
prior S1± trials on discrimination when S1
becomes negative (S1−) or when the new
stimulus (S2−) is negative. It seems clear
from the graphs, first of all, that discriminative
extinction is slower following 120 S1±
trials than it is following 32 S1± trials.
The interaction predicted earlier (Amsel,
1962) also seems evident. After 32 S1±
trials the discrimination (weakening of
response to S−) is faster for S1− than for
S2−. The reverse is the case after 120
S1± trials. This interaction becomes even
more obvious when the difference between
32 S1± → S1− and 120 S1± → S1− is
compared with the difference between 32
S1± → S2− and 120 S1± → S2−. The first
difference is of the order of 4 seconds at the
highest points of the two curves; the other
is about 1.5 seconds.

FIG. 9. Discriminative extinction as a function
of number of prediscrimination trials and
whether the prediscrimination stimulus (S1±) is
positive (S2−) or negative (S1−) in the
discrimination.

In order to perform a statistical evaluation
of the differences in Figure 9, 2-day
medians were computed for each S of
performance to S+ and S− in discrimination.
For each S, a count was made of
number of 2-day blocks to a criterion of
discrimination. The criterion was the first
2-second separation of goal times to S+
and S− (S+ being the smaller) followed
by no subsequent reversals, i.e., no subsequent
blocks on which the S− time was
smaller than the S+. Table 6 provides the
resulting blocks-to-criterion values for each
of the four groups of Figure 9. These values
were employed in a series of sign tests.
With an overall median of 12, a hypothesis
of no difference in the four columns of
Table 6 could be rejected at the .001 level
(χ² = 20.62, df = 3). Subsequent sign tests
indicated that the 120 S1± groups discriminated
more slowly than the 32 S1±
groups (χ² = 15.30, df = 3, p < .001);
that, after 32 S1± trials, discriminative
extinction is faster to S1− than to S2−,
but not quite significantly so (χ² = 3.26,
df = 1, .05 < p < .10); and that, after
120 S1± trials, discriminative extinction is
significantly faster to S2− than to S1−
(χ² = 6.88, df = 1, p < .01).
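
The blocks-to-criterion count can be illustrated with a short sketch. It assumes one reading of the criterion stated above: the first block on which the S− median exceeds the S+ median by at least 2 seconds, provided that no later block shows a reversal (S− time smaller than S+ time). The function name, argument names, and sample values are hypothetical and are not taken from the original data.

```python
from typing import Optional, Sequence


def blocks_to_criterion(
    s_plus: Sequence[float],
    s_minus: Sequence[float],
    separation: float = 2.0,
) -> Optional[int]:
    """Return the 1-based block on which the criterion is first met.

    s_plus and s_minus hold one median goal time per 2-day block.
    None means the criterion was not reached within the blocks given.
    """
    if len(s_plus) != len(s_minus):
        raise ValueError("need one S+ and one S- median per block")
    for i, (pos, neg) in enumerate(zip(s_plus, s_minus)):
        separated = neg - pos >= separation
        # A reversal is any later block on which the S- time is
        # smaller than the S+ time.
        no_reversal = all(
            n >= p for p, n in zip(s_plus[i + 1:], s_minus[i + 1:])
        )
        if separated and no_reversal:
            return i + 1
    return None


# Illustrative medians (seconds) for one hypothetical S.
example_plus = [3.0, 3.0, 2.0, 2.0, 2.0]
example_minus = [3.0, 6.0, 1.5, 5.0, 6.0]
print(blocks_to_criterion(example_plus, example_minus))
```

In this made-up series the separation on Block 2 does not count, because Block 3 is a reversal; the criterion is met on Block 4. An S that never meets the criterion returns None, which corresponds to a censored entry of the kind marked with a plus sign in a blocks-to-criterion table.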

Figure 10 compares S1+, S1±, and
S1±S2± as prediscrimination treatments.
In all cases, number of prediscrimination
exposures is 120. It is obvious that the
S1±S2± treatment severely retards extinction
to the negative stimulus in discrimination
and that S1± is followed by
slower extinction than is S1+, regardless
of whether it (S1) is positive or negative in
the discrimination. This much simply reflects
the operation of PRE in discrimination.
The prediction of a reversal of
order between the 120 S1± groups and the
120 S1+ groups in discrimination would
require faster extinction to S1− than to
S2− after 120 S1+ prediscrimination trials.
This could not be shown with a trials-to-criterion
analysis since only a total of eight
Ss was run under these S1+ conditions,
and the answer to this question will obviously
require further work.

FIG. 10. Discriminative extinction as a function
of nature of prediscrimination exposure,
showing fastest discrimination following S1+ and
slowest after S1±S2±, with S1± in between.

Analysis of Prediscrimination and Discrimination
Data from Runways 1 and 2
for the B± → B+W−, B± → B−W+,
and B±W± → B−W+ Conditions. Figure
11 shows a variety of relationships which
bear on the frustrative nonreward approach
outlined earlier. Since the separate findings
of this analysis have been demonstrated
many times in our research (see Experiment
1; Amsel, 1958, 1962; Amsel & Roussel,
1952), our purpose here is primarily to show
how these findings interrelate. Panels 1 and
5 of Figure 11 show FE measured in Runway
2 and its relation to partially rewarded
performance in Runway 1 in the prediscrimination
phase (in these graphs the
prediscrimination data for all three groups
are combined); Panels 2, 3, and 4 show FE
in Runway 2 during discrimination training,
for each of the discrimination conditions,
and particularly the diminution of FE in
relation to evidence of discrimination in
Runway 1; and, finally, Panels 6, 7, and 8
show the differences in rapidity and magnitude
of discrimination learning among the
three groups, all of which were exposed to
the same number of partially rewarded
prediscrimination trials.

A theory of frustrative nonreward such
as has been outlined predicts all of the separate
phenomena shown in Figure 11 and
also their interrelationships. The development
of FE during partial-reward acquisition
has already been demonstrated (Amsel
& Hancock, 1957; Roussel, 1952; Wagner,
1959). The FE shows up first on the fourth
block of trials (Panel 1) in the present data,
i.e., after 24 partially rewarded trials. The
earlier findings agree with this very well.
The decreased vigor and increased variability
of performance which depend on conditioning
of frustration (rF) to runway cues
and subsequent conflict should show up
after FE is manifest. Panel 5 shows this
increase in goal time at Block 6 (after 40
trials). This is a highly reliable finding in
this experiment. While only the combined
data of all prediscrimination groups are
presented, this rise in the PR acquisition
curve in Runway 1 appears consistently at
Block 6 in each of six curves showing, separately,
for each of the three groups, performance
on rewarded and on nonrewarded
trials. (This amounts to a kind of split-half
reliability test: there is no basis for differential
performance in Runway 1 on rewarded
and nonrewarded trials, and, clearly, performance
is not different on the two kinds
of trials.) Such evidence of conflict (Stage
3 in our analysis), of shorter or longer
duration, seems readily observable in
discrete-trial, partial-reward situations (Amsel,
1958) whenever it is sought.


TABLE 6
NUMBER OF 2-DAY BLOCKS TO CRITERION
Group
MOS 1205 2 Seb N Stk


> dip a NC
StSt &-5+ Srt SiS}


1 8 12 4 7
2 8 12 5 7
3 9 12 8 7
4 10 13 8 8
5 10 13 9 8
6 10 15 9 9
7 10 15 9 10
8 11 16 10 10
9 13 16 12 10
10 13 16 12 10
11 14 17 12 10
12 mf 18 13 11
13 17 18+ 13 1
14 18+ 18+ 14 1
15 18+ 18+ 15 15+
16 18+ 18+ 15+
17 18+

The decrease in size of FE attendant upon
discrimination, which was found in Experiments
1 and 2, is duplicated in the six
panels of Figure 11 showing performance
in discrimination. Panels 2, 3, and 4 are FE
data (Runway 2) for the conditions shown
in Panels 6, 7, and 8, respectively. It should
be apparent that size of FE declines as
performance to S+ and S− becomes differential.
Our interpretation of this phenomenon
is that FE depends on rR, and
when S discriminates in Runway 1, rR is
evoked by S+ while rF is evoked by S−.
Consequently, nonreward in G1 following
successful discrimination is preceded by
rF, not rR, and FE is reduced.

Finally, attention is called to Panels 6,
7, and 8 comparing the groups for rapidity
and magnitude of discrimination. If only
the extinction curve to the negative stimulus
had been presented, we would be showing
the same relationship as was seen in the
bottom three curves of Figure 10. After
120 prediscrimination trials, extinction is
most pronounced when S1± (B±) is followed
by a discrimination in which S1 is
positive and S2 (new) is negative; extinction
to S− is least pronounced when discrimination
follows a prediscrimination condition
where there has been partial reward of
responses to both discriminanda (B±W±).

The curves showing response strength to
the positive discriminanda are shown here
also to point up a phenomenon that is
troublesome in this kind of experiment:
there is a clear generalization decrement
("novelty" reaction?) to the stimulus which
is new in the discrimination. This shows as
a rise in the appropriate curve (in Panel 6
the new stimulus is W−, in Panel 7 it is
W+); consequently, the negative discriminandum
evokes a temporary decrement
in performance at the outset of discrimination
in the first case, while the positive
discriminandum shows the temporary decrement
in the second. There is no decrement
in Panel 8 since neither discriminandum is
new in the discrimination. We did not
anticipate this phenomenon, and it shows
up again in Experiment 4, which was
started before the termination of Experiment
3 and before we were aware of it.

FIG. 11. The relationships among prediscrimination
(partial reward) acquisition (Panel 5) and
discrimination performance (Panels 6, 7, and 8) measured
in Runway 1, and the corresponding FEs (Panels
1, 2, 3, and 4, respectively) measured in Runway 2.


EXPERIMENT 4: PARTIAL REINFORCEMENT
AND PREDISCRIMINATION EFFECTS WITH A
LARGE REWARD AND SINGLE DAILY
TRIALS


In the first three experiments, observations
were made of discrimination effects
and frustrative effects in Runways 1 and 2,
respectively, of the double-runway apparatus.
Experiments 4 and 5 employed a single-runway
apparatus so that discrimination
could be observed free from any possible
contaminating effects of Runway 2 measurement.
There were two other differences
between the fourth experiment and the
first three. In Experiment 4 we switched
from a procedure involving several trials
a day to one involving single daily trials.
With the change to a single reward per day,
it was also possible to switch from a relatively
small (.1 gram) reward at the end of
Runway 1 to a larger (.5 gram) reward at
the end of this same runway, now employed
as a single runway.

In other respects, Experiment 4 involved
conditions similar to those in the first
set of comparisons (see Table 5) of Experiment
3. Specifically, three variables were
manipulated: (a) color of prediscrimination
stimulus (black or white), (b) number of
prediscrimination trials (20 or 60), and (c)
reward condition of the prediscrimination
stimulus in the eventual discrimination.
Stimuli were presented successively, one
trial a day. This one-trial-a-day procedure
seemed optimal not only for evaluating
discrimination effects following different
amounts of exposure to S1± (when S1 was
positive or negative in the discrimination),
but also for studying trial-to-trial acquisition
effects in partial-reward training with large
reward.

Method 


There were eight experimental groups, each
with five Ss. In terms of the notation introduced


(а) 

prediscrimination stimulus
which was either black (B) or white (W), the number
of prediscrimination trials was either 60 or
20, and the prediscrimination stimulus was associated
with either reward (+) or nonreward (−) in
the subsequent discrimination.


84 


Ss were 40 female albino rats, obtained from 
Woodlyn Farms, Guelph, Ontario. All Ss were 
naive and approximately 110 days old on arrival
in the laboratory. Their weight range was between 
116 and 150 grams. 


runway, and a 23-inch goal box at right angles to 
the runway. 


returned to its cage, and fed Purina lab chow, 14 
grams for the first 10 days and 10 grams there- 
after. 

Five squads were formed by choosing at random 


same order throughout the experiment. Each 60-trial
S was then allowed to explore the alley of
the apparatus for 5 minutes daily for 3 days. All
20-trial Ss were brought to the experimental room
each day, but did not begin preliminary training
until 40 days after the 60-trial Ss. This feature of
the procedure allowed all Ss to begin discrimination
training at the same time, thus controlling
for variables such as minor temperature variation,
handling, and experience with the carrying
cage. The Ss of the squad were then returned to
their home cages and, approximately 30 minutes
later, were fed their daily ration of food.


TABLE 7
SUMMARY OF EXPERIMENT 4 CONDITIONS

Condition       Prediscrimination trials   General notation
B± → B+W−       60                         S1± → S1+S2−
W± → W+B−       60                         S1± → S1+S2−
B± → B−W+       60                         S1± → S1−S2+
W± → W−B+       60                         S1± → S1−S2+
B± → B+W−       20                         S1± → S1+S2−
W± → W+B−       20                         S1± → S1+S2−
B± → B−W+       20                         S1± → S1−S2+
W± → W−B+       20                         S1± → S1−S2+

During preliminary training, there was never
food in the apparatus, and the photoelectric timing
system was allowed to operate so that Ss might
become accustomed to the mild relay noises
associated with the system.

Prediscrimination Training. During this period, 
each 60-trial S received one rewarded or nonre- 
warded trial a day to the appropriate stimulus 
color. The procedure on an individual trial was 
as follows: (a) S was placed in the start box, and 
the orienting door was dropped after about 4 sec- 
onds; (b) 2 seconds later, the Plexiglas start door 
was dropped, and S traversed the runway to the 
goal box, where it was confined; (c) on reward 
trials, S found a .5-gram pellet which it was al- 
lowed to eat before being removed; (d) on nonre- 
ward trials, S was confined in the goal box for 20 
seconds before being removed. The sequence of 
rewards and nonrewards was +——+, —++-, 
+--+, —++ —. The prediscrimination-train- 
ing procedure for the 20-trial Ss was exactly the 
same, but was initiated 40 days later. 

Discrimination Training. Immediately follow- 
ing the prediscrimination trials, discrimination 
training was begun and was continued for 96 trials. 
On a given trial, the runway was either B or W
in the order BWWB, WBBW, BWWB, WBBW,
and the sequence of reward and nonreward was
the same as in prediscrimination training. The
procedure on a given trial was also the same as in
prediscrimination.


Results and Discussion


The data presented and analyzed are in
every case goal times, since these show most
clearly the effects of interest: variability
changes in partial-reward (prediscrimination)
acquisition, and performance changes
related to the positive and negative
discriminanda in the discrimination learning
phase.


Prediscrimination Partial-Reward Acquisition

As indicated earlier, the prediscrimina- 
tion phase of the present experiment, par- 
ticularly those conditions involving the 
larger numbers of trials, provides an excel- 
lent situation in which to observe the 
changes in vigor and variability predicted 
from our analysis of partial-reward acquisi- 
tion. Accordingly, the prediscrimination 
data were examined for evidence of these 
effects. 

Median times over four-trial blocks were
computed for each S. In addition, the range
of these four scores for each block was
taken for each S. Figure 12 plots for all
of the 60-trial Ss mean goal times and
mean goal-time ranges across four-trial
blocks for the 15 prediscrimination blocks.
These graphs show clearly (a) a pattern of
initial decrease (Blocks 1-5) followed by
increase (Blocks 6-10) followed by decrease
(Blocks 11-15) in goal times and (b) a
pattern of decrease (Blocks 1-5) followed
by increase (Blocks 6-15) in average variability
(range) of goal times.

FIG. 12. Goal time and variability (range)
measures for PR acquisition under
one-trial-per-day conditions.
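
The per-block summary described here (a median and a range over each block of four goal times for a single S) can be sketched as follows. This is only an illustration of the computation as the text states it; the function name and the sample goal times are hypothetical, and the group curves in the figure would additionally require averaging these values over Ss.

```python
from statistics import median
from typing import List, Sequence, Tuple


def block_summaries(
    goal_times: Sequence[float], block_size: int = 4
) -> List[Tuple[float, float]]:
    """Return (median, range) for each complete block of goal times.

    goal_times is one S's goal times in trial order; an incomplete
    final block is ignored.
    """
    summaries = []
    for start in range(0, len(goal_times) - block_size + 1, block_size):
        block = goal_times[start:start + block_size]
        summaries.append((median(block), max(block) - min(block)))
    return summaries


# Eight made-up goal times (seconds) give two four-trial blocks.
times = [1.0, 2.0, 3.0, 10.0, 2.0, 2.0, 4.0, 4.0]
for block_number, (med, rng) in enumerate(block_summaries(times), start=1):
    print(block_number, med, rng)
```

With 60 prediscrimination trials per S, the same call would yield the 15 four-trial blocks referred to in the text.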

A Type I analysis of variance (Lindquist,
1953) of the goal-time ranges was performed
for the 60-trial groups, 60 B± (N = 9)
and 60 W± (N = 9). Neither the interaction
of color and blocks (F = .92) nor the
main effect of color (F = 1.92) was significant.
The two groups were therefore pooled,
disregarding color, to form Group 60 S1±,
and a Treatments X Subjects analysis was
performed to test for changes in variability
over blocks. The block effect was significant
at the .001 level, F = 3.85 (df = 13/238).


Discrimination Learning Phase

During discrimination, there were four
trials to the positive stimulus and four
trials to the negative stimulus in each
eight-trial block. Each S's median goal
times to the positive and negative stimuli
were computed for each of 12 eight-trial
blocks, and these were used in the analysis.

Rate of discrimination learning was defined
in terms of separation of performance
curves to positive and negative discriminanda.
The criterion of discrimination learning
was the first block at which goal time
to the negative stimulus was greater and
remained greater than goal time to the
positive stimulus.

Statistical analysis of the discrimination
data indicated that, in terms of this
blocks-to-criterion measure, none of the three
variables manipulated in prediscrimination
and discrimination, namely (a) color of
prediscrimination stimulus, (b) number of
prediscrimination trials, or (c) reward condition
of the prediscrimination stimulus in the
discrimination, produced a significant effect
on rate of discrimination learning.

Figure 13 is a graphic representation of
discrimination learning for each of the
color-combined groups in terms of goal
times to S+ and S− across the first eight
eight-trial blocks. The most noteworthy
feature of these graphs is a generalization
decrement or "novelty" effect: there is, at
the start of discrimination training, a decrement
in response strength to the stimulus
that is new (S2). Such a factor could not
be ruled out in this experiment since all
prediscrimination treatments involved
partial-reward training to only one of the
eventual discriminanda. Groups 60 S1± →
S1−S2+ and 20 S1± → S1−S2+ (new
stimulus positive) show a large decrement
to S2 at the start of training, which appears
to retard discrimination. The other groups
(new stimulus negative) show smaller decrements
to S2 at the beginning of training,
but sufficient so that in the 20-trial condition
(20 S1± → S1+S2−) the curves never
cross, and there is apparent discrimination
from the start of training.

FIG. 13. Discrimination learning as a function of number of prediscrimination trials and whether
the prediscrimination stimulus (S1) appears as the positive (S1+) or negative (S1−) stimulus in the
discrimination (S2 is new in the discrimination).

The prediscrimination data of Experiment
4 throw some light on the validity of the
sequence of processes hypothesized for
partial-reinforcement acquisition. The variability
data and the goal-time data for the
60-trial Ss show (a) an initial decrease
followed by (b) an increase. This accords
with the stages where (a) rR-sR is developing
with early rewards and nonreward has
no effect; and (b) rR and rF have been
conditioned, and their stimuli, sR and sF,
evoke competing approach and avoidance
response tendencies.

However, the variability data, particularly,
suggest that Stage 4, in which the
temporary conflict is resolved in favor of
running and sF comes to elicit approach
tendencies, was not reached in this experiment.
While the downward trend of the
goal-time curve from Block 8 to Block 15
suggests an approach to Stage 4, the variability
curve seems still to be rising at Block
15. Earlier data (Amsel, 1958) showed the
temporarily increased variability between 
Trials 24 and 54; increased variability in 
the present study started at about 98 trials. 
This suggests the possibility that more 
trials might have been needed in the present 
study for sF to elicit approach. But, since
the conditions in the two experiments were
different (the earlier study involved 6 trials
per day with water-deprived Ss, the present
study involved 1 a day with food-deprived
Ss), only a guess could be made as to the
number of trials required, and we apparently
guessed wrongly.

If Stage 4 is not reached in prediscrimination
acquisition, there is no basis for differential
predictions concerning discrimination.
This, added to the generalization decrement
effect at the outset of discrimination,
suggested a change in approach and led to
Experiment 5.


EXPERIMENT 5: B±W+ AS A PREDISCRIMINATION
CONDITION, AND A DEMONSTRATION
OF WITHIN-SS PR ACQUISITION
EFFECTS


While the preliminary-acquisition data
of Experiment 4, under S1± conditions,
afforded some support for our hypotheses
about stages in acquisition related to variability
changes, the discrimination data were
somewhat inconclusive. At least part of
the reason for this was in the nature of the
experiment itself, and this applies also to
Experiment 3: whenever we switched from
the prediscrimination condition to the
discrimination test, there was a clear generalization
decrement in relation to the stimulus
that was new in the discrimination;
this obscured, to some extent, the differences
in rate of discrimination. As a consequence,
we turned to a new prediscrimination
condition which would eliminate this
generalization decrement factor. This involves
exposing Ss to both of the eventual
discriminanda, S1 and S2, while partially
reinforcing approach responses to one,
continuously reinforcing these responses to
the other. The general notation for this
condition is S1±S2+, although as reported
here it will be B±W+. It will become
clear later that this prediscrimination condition,
with its exposure to both discriminanda,
still permits a test of our basic
prediction: that following many trials of
B±W+, a B+W− discrimination will
appear to be formed earlier than B−W+;
and that following few such trials, the
B−W+ discrimination will be apparent
earlier.

Although the test of rates of discrimina- 
tion was our most urgent business when 
we started this experiment, it soon became
cen peared ain, oer e
, in the prediscrimination phase, 
of at least equal interest. O Чу, 


two findings are reported which are, in a 
sense, products of a single experiment; that 
involve the same Ss in the same
It will, however, be more con- 
venient and perhaps more clarifying to 
treat these two aspects of the study sepa- 


The experiment itself is straightforward.
Performance of rats in a straight-alley run- 
way was examined. In a prediscrimination 


phase, all Ss were subjected to partial 


was run one trial to W+ (rewarded), one
trial to B+ (rewarded), and one trial to B−
(not rewarded). Following this phase of the
experiment, half the Ss of each sort were
switched to a B+W− discrimination, and
the other half to the B−W+ discrimination


tration theory. These aspects of the experiment
will be treated separately following a
more detailed description of the procedure.


Method 


Ss were 34 albino rats of the Wistar strain supplied
by Woodlyn Farms, Guelph, Ontario. They
were approximately 90 days old at the beginning
of the experiment. Ss were housed in separate
living cages, maintained on an ad libitum water
schedule, and fed at the end of each experimental
session, so that their total daily ration including
reward pellets was 10 grams of Purina lab chow.


Apparatus

The apparatus was a straight, enclosed plywood
runway consisting of a 15-inch entry box, a
12-inch start box, a 50-inch alley, and a 12-inch
goal box. Inside width was 214 inches throughout.
All sections had hinged, clear Plexiglas lids. The 
walls of the entire runway, including the start
box, could be made either black or white by in-
experimen
vided by two 25-watt lamps mounted in trans- 
lucent fixtures in the ceiling so as to minimise 


Procedure 

Experimental In 
34 Ss were in the experimental condition black-
50% reinforcement (B±) and white-100% reinforcement
(W+). However, 12 Ss ran this condi-


were randomly assigned to one of two discrimination
conditions, B+W− or B−W+.
Habituation. Ss were placed on a 10-gram per 


the 
gentled. On the 2 days prior to the experimental 
training, each S was allowed to explore the run- 
way for 5 minutes with timers operating so that 
Ss might become accustomed to the mild clicking 
noises associated with the operation of the equip- 
ment. No reward was present on exploration days. 

Experimental Training. A training trial was
initiated with the introduction of S into the entry
box. All Ss received three trials per day, separated
by about 20 minutes. Positive trials were
rewarded with a .5-gram Noyes pellet. On nonreward
trials S was removed from the goal box after
30 seconds. In prediscrimination training all six
possible ordinal arrangements of B+, B−, W+
were presented in any 6-day block. The day-to-day
order of these six arrangements was randomized
every 6 days.
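
The counterbalancing scheme just described can be sketched in a few lines: three trial types have 3! = 6 possible within-day orders, each order is used once in a 6-day block, and the day-to-day order is reshuffled for every block. The function names, the seed, and the use of a pseudorandom shuffle are illustrative assumptions; the original randomization method is not described.

```python
import random
from itertools import permutations
from typing import List, Tuple

TRIAL_TYPES = ("B+", "B-", "W+")  # rewarded black, nonrewarded black, rewarded white


def six_day_block(rng: random.Random) -> List[Tuple[str, ...]]:
    """Return the six possible daily orders of the three trial types, shuffled."""
    orders = list(permutations(TRIAL_TYPES))  # 3! = 6 arrangements
    rng.shuffle(orders)
    return orders


def schedule(n_blocks: int, seed: int = 0) -> List[Tuple[str, ...]]:
    """Return one daily order per day for n_blocks consecutive 6-day blocks."""
    rng = random.Random(seed)
    days: List[Tuple[str, ...]] = []
    for _ in range(n_blocks):
        days.extend(six_day_block(rng))
    return days


# Three blocks give 18 days of three trials each.
for day, order in enumerate(schedule(3), start=1):
    print(day, " ".join(order))
```

Because every block contains each arrangement exactly once, each trial type occupies each ordinal position twice per block, whatever the shuffle produces.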

Following prediscrimination training, Ss were
assigned to one of the two conditions (B+W−
or B−W+) for discrimination training and continued
to run three trials per day. The discriminanda
were presented in 6-day blocks as before,
but in such a manner as to equalize number of
positive and negative stimuli every 2 days. The
order of stimulus presentation was W−W−B+,
B+B+W−, W−B+B+, B+W−W−, W−B+W−,
B+W−B+ and was maintained throughout
discrimination training.


Partial-Reinforcement (Acquisition) Effects 
within Sst 

In the prediscrimination (acquisition) 
condition, the 22 Ss of the many group were 
partially reinforced for an approach response 
in the presence of a black stimulus (B+) 
and continuously reinforced for the same 
response in the presence of white (W+). 
This group of Ss, running under B+W+ 
conditions, was carried to 324 trials, and 
an examination of the data uncovered, 
for the first time, a clear indication that 
the partial-reinforcement acquisition effect 
previously shown by others (Goodrich, 1959; 
Haggard, 1959; Wagner, 1961; Weinstock, 
1954) using separate partial- and continuous- 
reinforcement groups—that is, as a between- 
group difference—can be demonstrated to 
occur within Ss. Ss running under the
B+W+ condition, partially reinforced to
black and continuously to white, eventually
started and ran more rapidly in the presence
of the stimulus signaling partial reinforcement
(B±) than to the cue for continuous
reinforcement (W+), but entered the goal
more slowly to B+ than to W+.


Results and Discussion 


The results of this phase are shown in 
Figure 14. Here are plotted mean median 
starting, running, and goal times against 
successive 6-day blocks of trials for B+ 
and for W+ trials separately. For each S
a median score of each kind was computed
for each 6-day block. These were then 
averaged over all Ss for each block. (This portion of
Experiment 5 has been reported separately and in greater
detail elsewhere; see Amsel, MacKinnon, Rashotte, & Surridge, 1964.)
For starting and running times, the curves
separate after the first four or five blocks 
(72-90 trials), and the starting and running 
times to B stay lower (faster) from then 
on. For the goal measure, the curves sepa- 
rate from the fourth block on, the W+ 
performance being faster after that. Since 
all within-panel and between-panel com- 
parisons involve the same Ss, rather than 
different groups of Ss, the probabilities 
that successive differences between B+ 
and W+ would remain in the same direc- 
tion could be estimated by raising .5 to the 
fourteenth, thirteenth, and fifteenth powers, 
respectively, for starting, running, and goal 
times. These data are so reliable that they 
can be reproduced, approximately, for 20 
of the 22 Ss individually. The phenomenon 
does not depend on averaging over Ss. 
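
The probabilities mentioned above follow from a simple sign-test argument: if B+ and W+ were equally likely to be the faster in any block, the chance that n successive differences all fall in one specified direction is .5 raised to the nth power. A minimal sketch of that arithmetic follows; the exponents are those given in the text, while the dictionary and function names are illustrative.

```python
from fractions import Fraction

# Number of successive same-direction block differences for each measure,
# taken from the exponents stated in the text.
RUN_LENGTHS = {"starting": 14, "running": 13, "goal": 15}


def same_direction_probability(n):
    """Chance of n successive same-direction differences under a 50:50 null."""
    return Fraction(1, 2) ** n


if __name__ == "__main__":
    for measure, n in RUN_LENGTHS.items():
        p = same_direction_probability(n)
        print(f"{measure}: (.5)^{n} = {p} = {float(p):.2e}")
```

Exact fractions are used so that the very small values are not rounded before they are displayed.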

The generality of these findings might be 
questioned by raising the possibility that 
they depend upon the particular conditions 
of the present experiment. For example, 
the partially reinforced stimulus is always 
black, and the continuously reinforced, 
white. Are the results, then, an artifact of 
black versus white? In addition, the nature 
of the experiment, as we performed it, 
demands that the partially reinforced stimu- 
lus be exposed twice as frequently as the 
continuously reinforced. Do the results in 
any way depend on this factor? Obviously, 
the best answers to these questions await 
further work, and we are now conducting 
the necessary experiments; however, our 
data would seem to suggest that the phe- 
nomenon has some generality. First of all, 
there is the reversal of findings for the 
starting and running measures, on the one 
hand, and the goal measure, on the other. 
If either a black-white preference or number
of exposures to black as against white
produced the effect, it should show up in all
three measures. The largest differences 
found are in the goal measure, where per- 
formance to white is superior to performance 
to black. Secondly, a trial-by-trial—rather 
than block-by-block—examination of the 
first 18 days (54 trials) shows no initial 
tendency to run faster to black but some 
such tendency to white, the stimulus associated
with continuous reward (see Figure 15). This
shows up particularly in the
starting measure where Ss run faster to W+
than to B± on 9 out of the last 10 of these
first 18 days. Examination of later segments
shows that more vigorous performance in
the presence of B emerges in the starting
and running measures at about the time
the opposite tendency emerges in the goal
measure, after about 72 trials.

Fig. 14. Reversal of positions of W+ and B+ between starting and running measures, on the one hand,
and the goal measure, on the other, in the within-Ss PR acquisition effect.

An initial difference in performance favoring
the continuous condition has been a
finding characteristic of experiments where 
separate partial and continuous groups are 
employed (see earlier references). In these 
latter kinds of studies, equal numbers of 
trials are usually run under partial and 
continuous conditions, meaning fewer rein- 
forcements for the former than for the 
latter. In the present experiment numbers 
of reinforcements, and not numbers of trials,
have been equalized between the B+ and
W+ conditions; and the initial superiority
of the continuous conditions is still demonstrable.

Fig. 15. Analysis of performance to W+ and B+ in three response measures for the first 18 days.

These results confirm and extend findings
previously only demonstrated between 
groups. They indicate that the mechanisms 
of partial versus continuous reinforcement 
as applied to runway acquisition behavior 
in rats, at least, can exist and operate simul- 
taneously, in the same organism. That is to 
say, organisms in simple learning situations
can behave in a manner appropriate to
partial (intermittent) or to continuous
reinforcement, at different times, depending
on the difference in a single environmental
stimulus. The results are generally similar to
those obtained when one group is run under
partial and another under continuous
reinforcement in what is otherwise the same
experimental situation. In such studies, also,
the asymptotic performance shows, very
characteristically, greater vigor of approach
under partial than under continuous rein- 
forcement in the starting and running 
measures and lesser vigor of approach under 
partial than under continuous reinforcement 
in the goal measure. An explanation for the 
superiority of PR acquisition performance 
in terms of frustrative nonreward has been 
offered by Spence (1960). Wagner (1961) 
has gone further to suggest an explanation 
for failure to find PR superiority in the 
goal measure: 


That the partial superiority is not found on those
response measures close to the goal may be attributed
to less effective conditioning of the approach
response to the goalbox cues plus s_F.
This argument appears quite reasonable in the
context of the prior assumptions if it is noted
that r_F − s_F may be expected to become conditioned
to the approach response when the s_F cues
are introduced initially at a weak value, and hence
with a negligible tendency to elicit competing
responses, and are then increased gradually at the
same time that the approach response is being
strengthened. This condition obtains much more
clearly in the early portions of the alley than in
the goal region where due to the proximity of the
primary frustration event, generalized r_F − s_F
may be expected to occur earlier and to follow a
course of greater intensity, and thus produce responses
which compete more effectively with the
approach response [p. 240].


Prediscrimination Experience and Discrimination Learning


In the introduction to Experiments 3 and 
4, Table 4 presented various kinds of prediscrimination
exposure to discriminanda
prior to discrimination training. Experiments
3 and 4 treated those conditions above the
dashed line of the table. The present experiment
examines those cases not previously
considered. In these latter two cases, both
of the eventual discriminanda are presented
prior to discrimination training, and one is
partially rewarded while the other is continuously
rewarded. We have restricted
ourselves to the B+W+ prediscrimination
condition, noting that the tests of our hypotheses
then involve two discrimination
tasks: B+W− and B−W+. By selecting
these tasks and varying the number of
such prediscrimination trials we are able to
investigate the effects of prior experience
on later discrimination learning, after studying
asymptotic behavior in relation to the
stimuli associated with the partial (B+)
and continuous (W+) reward conditions
in the prediscrimination phase of the experi- 
ment. 

On the basis of the sequence of hypotheses 
outlined in earlier papers (Amsel, 1958, 
1962), describing stages of development of 
anticipatory reward and anticipatory frus- 
tration in relation to number of partially 
rewarded trials, it was predicted that Ss
switched to discrimination learning after 18
days (few) of prediscrimination training
would learn the B−W+ task more readily
than the B+W− task; whereas Ss switched
after 108 days (many) would learn the
B+W− task before the B−W+ task.
These relationships are shown graphically 
in Figure 16. It should be noted that, as in 
Experiment 3, the direction of the predic- 
tion depends on number of prediscrimination 
trials and that the two predictions are 
reversed. The points on the graph are
meant only to represent relative positions 
and not absolute strengths of performance. 
Such predictions derive from several con- 
siderations. (a) After few trials, cues from 
anticipatory frustration will not have become
conditioned to approach, but after
extended training they will have become so
conditioned. Consequently, B+ switched
to B− will tend to slow discrimination
after many prediscrimination trials but not
after few. (b) When, after many trials, W+
becomes W−, an immediate frustration
effect should facilitate the learning of
B+W−; after few trials this is not such a
potent factor. (c) When white remains
positive in the discrimination and B+
changes to B−, after few trials, Ss are at
a stage where they are running more strongly
to W+ than to B, because of some conflict
to B+, and this should make B−W+
an easier discrimination than B+W−.
These predictions were made before we
saw the B+W+ data. In view of the prediscrimination
data, we would expect these
predictions to hold only for the starting-
and running-time measures in the present
experiment. The reasons are these: evidence
from the prediscrimination phase can be
taken to indicate that in the segments in
which starting- and running-time measures
are taken, s_F evokes approach in the many-trial
group and avoidance in the few-trial
group; while in the goal-time segment, s_F
evokes avoidance in both groups. Where s_F
elicits approach, evidence of discrimination
will be retarded and depend on the breaking
down of such tendencies; where s_F elicits
avoidance, discrimination will be apparent
almost from the outset. Accordingly, faster
discrimination of B+W− than B−W+
would be expected to occur in starting and
running times after many trials, since the
mechanism of persistence is operating to
slow discriminative extinction to B−. On
the other hand, faster discrimination of
B−W+ than B+W− should be observed
in the goal time after many, and in starting
and running times after few trials, since in
these cases the prediscrimination treatment
has increased the susceptibility of B to
rapid extinction.

Fig. 16. Diagrammatic representation of the
prediscrimination-discrimination hypothesis when
few and many B+W+ trials are the prediscrimination
conditions.


Results and Discussion 


A scattergram of each day’s raw data 
was plotted throughout prediscrimination 
training for the Ss of the many-trial group. 
An increase in variability occurred about 
Days 15-18 and was taken as evidence of 
anticipatory frustration producing conflict. 
This evidence of increased variability subsequently
disappeared and the B± measures
of starting and running speeds exceeded
the W+ measures (see earlier analysis).
The Ss of the few-trial group were switched
from the B+W+ training to discrimination
(B+W− or B−W+) at the stage where
evidence of conflict was greatest (Day 18).

The results shown in Figure 17 are mean 
median starting, running, and goal times 
over 6-day blocks for groups receiving the 
four combinations of number of prior prediscrimination
trials, and type of discrimination
task (M/B+W−, M/B−W+, F/B+W−,
F/B−W+). Prediscrimination
data of the 108-day group, shown again in 
Figure 17, provide a basis for evaluating 
discrimination effects. The prediscrimina- 
tion data for the 18-day groups are not 
shown since they extend only over the 
first three points. 
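
The "mean median" scores plotted in these figures are obtained in two steps: a median is taken for each S within each 6-day block, and those medians are then averaged over Ss. A minimal sketch of that computation follows, assuming each S's times for one stimulus and one response measure are already grouped by block; the function name, the data layout, and the toy numbers are hypothetical and are not data from the experiment.

```python
from statistics import mean, median


def mean_median_by_block(blocks_by_subject):
    """Median per S within each block, then the mean of those medians over Ss.

    blocks_by_subject maps a subject label to a list of blocks; each block is
    the list of times (in seconds) for one stimulus and one response measure.
    """
    n_blocks = min(len(blocks) for blocks in blocks_by_subject.values())
    return [
        mean(median(blocks[b]) for blocks in blocks_by_subject.values())
        for b in range(n_blocks)
    ]


if __name__ == "__main__":
    # Toy numbers for illustration only; not data from the experiment.
    toy = {
        "S1": [[2.0, 1.5, 9.0], [1.0, 1.2, 1.1]],
        "S2": [[3.0, 2.5, 2.0], [0.9, 1.3, 4.0]],
    }
    print(mean_median_by_block(toy))
```

Taking the median within each S first keeps an occasional very long time, such as the 9.0 in the toy data, from dominating that S's score for the block.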

Fig. 17. Comparison of discriminative performance in the three measures
following few and many B+W+ trials.

While the many groups were run 12 6-day
blocks in discrimination, the few groups were
discontinued after 5 such blocks when their 
times to run to S— were already very long. 
This difference in rate of build-up of dis- 
crimination, as between the few- and many- 
trial groups, generally considered, reflects 
the action of PRE mechanism in discrimina- 
tive extinction. The magnitude of this 
difference can be appreciated by comparing 
the few (dashed line) curves with the many, 
particularly at the fifth point. There are, in 
addition to this general characteristic of the 
results, some particular differences of in- 
terest. These relate more specifically to the 
differential predictions which seemed possible
from our analysis and from the characteristics
of the prediscrimination data (left-hand
panels of Figure 17).
Our analysis suggested a retardation in
discrimination, measured in terms of starting
and running times, for M/B−W+
compared with M/B+W−, and the reverse
for the F/B−W+ and F/B+W− conditions.
These measures show that Group
M/B+W− was “discriminating” on the
first block of trials, while the discrimination
of M/B−W+ was not apparent until Block
6 in the running and Block 9 in the starting
time. Starting measures of Groups F/B+W−
and F/B−W+ show discrimination
beginning at Block 2; however, the
B−W+ discrimination is more marked
on Blocks 3-5. The difference in running
time between the few groups, although not
great, is in the opposite direction from that
of the many groups, i.e., B−W+ seems to
produce faster discrimination than B+W−
in the former, while the reverse seems to
hold in the latter. Consequently, the data 
for the few groups were analyzed in the 
following manner: median running times 
to S+ and S— were computed over 2-day 
rather than 6-day blocks, and the block 
on which each S showed discrimination 
without subsequent reversal was determined. 
The six Ss in B−W+ reached the criterion
on Blocks 1, 1, 1, 1, 7, and 10. The comparable
figures for Ss running B+W− were
3, 4, 7, 7, 7, and 14. A Mann-Whitney U
test allows rejection of the null hypothesis 
(U = 7, p < .05). 
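
The U statistic can be recomputed from the criterion blocks listed above. The sketch below is a minimal pure-Python illustration rather than the authors' procedure: over all 36 pairs it counts how often a B−W+ score exceeds a B+W− score, and it reports tied pairs separately because the text does not say how ties were handled. The function and variable names are hypothetical.

```python
# Block on which each S reached criterion, as listed in the text.
B_MINUS_W_PLUS = [1, 1, 1, 1, 7, 10]
B_PLUS_W_MINUS = [3, 4, 7, 7, 7, 14]


def pair_counts(a, b):
    """Count pairs (x in a, y in b) with x > y, and pairs with x == y."""
    greater = sum(1 for x in a for y in b if x > y)
    tied = sum(1 for x in a for y in b if x == y)
    return greater, tied


if __name__ == "__main__":
    greater, tied = pair_counts(B_MINUS_W_PLUS, B_PLUS_W_MINUS)
    print("pairs in which the B-W+ S was slower:", greater)
    print("tied pairs:", tied)
    print("U with each tie counted as one half:", greater + 0.5 * tied)
```

With these scores the strict count is 7, which agrees with the reported U; counting each of the three tied pairs as one half would give 8.5 instead.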

Our expectation was that the goal-time 
measure would show faster discrimination 
of B−W+ than of B+W− regardless of
number of prior trials. While there seems 
little question that this expectation is con- 
firmed in the Few groups, the difference 
in the case of the Many groups, although 
in the expected direction, is not as clear. 
A breakdown of the first portion of these 
curves into 2-day blocks instead of 6-day 
blocks shows B−W+ curves separated
from the outset, as would be expected if B−
in discrimination acts like B+ in prediscrimination.
The B+W− separation starts
reversed, and the W— does not begin to 
rise until the sixth 2-day block. These 
results, opposite in direction to the running- 
and starting-time data, would be difficult 
to understand were it not for the predis- 
crimination goal-time performance. 

Examination of the discrimination data 
for the few groups, particularly for running 
and goal time, reveals that Ss in the B+W−
condition ran slower to B+ than did Ss in
the B−W+ condition to W+, while performance
to the negative stimulus seemed
unaffected by its color. It would appear,
then, that switching from B+W+ as a
prediscrimination condition affects positive
discriminanda more than negative ones.
This makes sense if one accepts that, at this
(conflict) stage, s_F is not yet conditioned
to approach. In this case, the main effect of
prediscrimination experience is to make
discrimination behavior to B+ following
B+ more variable and less strong (on the
average) while behavior to W+ remains at
its prediscrimination level. This reasoning 
was part of the basis for the predictions 
offered earlier. 


CONCLUDING CONSIDERATIONS 


The experiments reported address them- 
selves to the role of frustrative factors in 
the extinction of responses, both in simple 
instrumental approach learning and in 
discrimination learning and, therefore, to 
the more general problem of persistence. 


Nonreward and Persistence 

Theories of learning proposing positive 
monotonic relationships between number of 
reinforcements and resistance to extinction 
(e.g., Hull, 1943) are unsuitable conceptuali- 
zations of PRE and the related findings on 
resistance to discrimination reported in this 
monograph. For example, the results of
Experiment 2 are that 48 rewards of responses
to S1+S2+, compared with 12
rewards, set up a faster S1+S2− (or
S1−S2+) discrimination. Here is a case
where number of rewards and resistance to
discriminative extinction are inversely related.
On the other hand, Experiments 3 and
5 appear to indicate a positive relationship
between number of partial rewards in response
to a stimulus (S1) and resistance to
discriminative extinction when that stimulus
becomes negative (S1−S2+) in a subsequent
discrimination. 

Statistical association theories of learning 
(e.g., Estes, 1959) have not addressed
themselves particularly to the problem of
persistence. They have dealt more with
acquisition functions, and would seem to
require (a) monotonicity in such functions
when the response is partially rewarded, 
and (b) asymptotic performance levels cor- 
responding directly to percentage or proba- 
bility of reinforcement. If so, our data 
suggest that such theories represent over- 
simplifications of the psychological mecha- 
nisms in acquisition of partially rewarded
responses. With respect to monotonicity,
statistical association theory would have to 
incorporate some means of dealing with 
the increases in variability of performance 
that occur at a certain stage (Stage 3, in our 
analysis) of partial-reward acquisition (see 
Panel 5, Figure 11; also Experiment 4 of the 
present report). As for both monotonicity 
and acquisition asymptotes, statistical 
theory would seem not to be the appropriate 
conceptualization of the paradoxical acquisi- 
tion effects, varying with separate segments 
of the response chain, demonstrated within 
Ss in Experiment 5. 

It would appear that such relationships 
as are presented here require for their 
explanation a theoretical approach to learn- 
ing that ascribes active properties to nonreward
(Amsel, 1962), a characteristic of
neither classical Hullian theory nor the 
statistical association theory, to mention 
only two of the more conventional ap- 
proaches. There are, at present, two obvious 
alternatives, in terms of theoretical strategy, 
for conceptualizing the role of nonreward 
in acquisition and the building of persis- 
tence. In general and nontechnical terms, 
one of the alternatives holds that persis- 
tence is the result of learning to continue 
to a goal in the face of indications of failure 
or uncertainty; the other, that individuals 
are persistent when they have “come to 
love things for which they have suffered 
[Festinger, 1961, p. 11].” Neither of these 
positions, stated in the language of common 
sense, does much more than beg the ques- 
tion of persistence. However, both positions 
have been offered more formally and analyti- 
cally, with attempts to identify the causal 
mechanisms, and have stimulated much 
experimental work. The earlier of the two 
(Amsel, 1958; Wilson, Weiss, & Amsel, 
1955), the anticipatory frustration account
of persistence, has been detailed in the introduction
to this paper; and the hypotheses
tested in the present paper were derived
from it. The later account (Festinger, 1961;
Lawrence & Festinger, 1962) proposes that
PRE and persistence are indicants of increased
attractiveness (“extra attractions”)
of a goal region, or of an activity toward a
goal, acquired as the result of cognitive
dissonance created for an animal by insufficient
rewards.

The S-R account of the effects of active 
nonreward treats persistence as a learned 
tendency to continue responding (approaching)
in the face of feedback stimulation
from an anticipatory frustration response.
The mechanism of persistence, according
to such an analysis, is s_F → R_app. The
experiments in this paper and the discussion 
of them have been suggested by this kind 
of theory. The approach seems moderately 
fruitful, and the degree of fit between theory 
and experiments has already been discussed. 

The cognitive account of the active prop- 
erties of nonreward by Lawrence and Fest- 
inger was not designed to handle the findings 
of this paper; nevertheless, such a theory 
would eventually have to encompass these 
phenomena. Extra attractions for the activity
or the apparatus or the goal events
cannot, it would seem, account for the 
phenomenon of extinction to begin with. 
Extinction implies continuous nonrewards, 
and Lawrence and Festinger have indicated 
that zero reward is the optimal condition 
for building extra attractions in the goal 
region. This neocognitive theory, which has 
a certain intuitive validity for describing 
facets of human behavior, does not seem 
to have the power of a more analytical 
approach based on conditioning premises. 
While it accounts for increased resistance to 
extinction after partial reinforcement, it has 
difficulty with ultimate extinction. While 
it accounts for PR extinction effects, it 
would have difficulty with the complicated 
PR acquisition effects—already in the litera- 
ture and to which we have referred —and 
particularly with the results of our Experi- 
ment 5. Why should S start and run faster 
to a stimulus associated with insufficient 
rewards (as compared to a stimulus associated
with continuous rewards), but enter
the “extra-attractive” goal area more slowly
in the presence of this stimulus? How does
increased attractiveness account for the
increased variability that develops in all
PR conditions, particularly with large rewards
(Experiment 4)? Finally, how does
increased attractiveness of activity, the goal
region, or the goal event account for the
findings of Experiment 3: that a B&W
prediscrimination condition retards subse- 
quent (B+W−) discrimination much more
drastically than does a B+ prediscrimination
condition? The conditions have involved
an equal number of nonreward experiences.
Why is a B−W+ discrimination faster
than B+W− after a small number of B+
prediscrimination trials, and slower after
a large number of such prediscrimination 
trials? 

Assuming that the findings of our experi- 
ments are replicable, these questions and 
others must be asked of any theoretical 
approach to the study of nonreward effects. 
Two other, more general, questions arise. 
One pertains to a distinction between frus- 
trative and nonfrustrative persistence. (We
are not suggesting that all persistence is
necessarily frustrative.) The second is about 
the relationships that may obtain between 
persistence, vigor, and choice as dimensions 
of behavior. 


Frustrative and Nonfrustrative Persistence 


If resistance to extinction is the proto- 
type of persistence, PRE is an indication 
that persistence is greater following partial 
than following continuous reward, the rule 
being that partial reward in acquisition 
leads to greater resistance to extinction. The 
current experiments along with accumulat- 
ing evidence from other experiments suggest 
that this rule is a great oversimplification; 
that the presence of PRE in extinction, and 
of PRE-like effects in discrimination, de- 
pend on a variety of factors, including 
amount of PR experience; that, in fact, we 
can make continuously rewarded Ss re- 
sistant to extinction (persistent) and par- 
tially rewarded Ss nonpersistent. The con- 
jecture, in connection with this latter 
possibility, is that there are at least two 
kinds of persistence, “frustrative” and “nonfrustrative,”
one of which necessarily involves
partial reward, the other not.

An example of frustrative (or “excited”)
persistence is of course the ordinary partial
reinforcement experiment in which persistence
is built into the partial group by
allowing frustration to occur in the acquisition
(and extinction) of the response, while
persistence is weakened in the continuous
group by allowing little or no frustration in 
acquisition, exposing S to frustration only 
in extinction. Recent experiments have 
demonstrated that extinction is facilitated 
by relatively large numbers of continuous 
rewards (e.g., North & Stimmel, 1960) and 
relatively high reward magnitudes (Armus, 
1959; Hulse, 1958; Wagner, 1961). Such 
relative nonpersistence might also be ac- 
complished in a partially rewarded group 
by holding frustration down in acquisition 
(but not in extinction) with a sedative or 
tranquilizer. The rationale of such a proce- 
dure would be that the effect of the drug in 
acquisition was to suppress the development 
of frustration during partial reward and 
hence prevent the formation of a connection 
between anticipatory frustration-produced 
cues and responding. A necessary feature 
of the sedative or tranquilizer in this case 
would be that it have no effect in acquisition 
on the development of r_R, which would
have to be present later in extinction for 
the occurrence of frustration and rapid 
elimination of the response. 

Persistence may also be built by allowing 
no excitement in either acquisition or ex- 
tinction. An example of such nonfrustrative 
(or "phlegmatic") persistence would be 
acquisition under continuous reward fol- 
lowed by extinction under the influence of a
sedative or tranquilizer. It could also take
the form of reappearance of an extinguished
response (or response to S−) when a drug
is administered following extinction (or dis- 
crimination). Persistence, in these cases, 
would be due to the failure of frustration to 
operate as an aversive factor. In this con- 
nection a recent report by Terrace (1963b) 
is instructive. He has shown that administer- 
ing a tranquilizer will greatly increase the 
number of “error” responses made by pi- 
geons to the negative discriminandum after 
perfect discrimination had been learned in 
the usual way, i.e., reward for responses to 
S+ and nonrewards for responses to S−.
On the other hand, when discrimination 
learning had been managed in such a way 
as to prevent the pigeon from making any 
responses to S— which would be nonre- 
warded, that is to say, when discrimination
learning was “errorless,” the drug injection
produced no alteration in discriminative
performance. It would appear from Terrace's
work that if nonrewards are involved in the 
formation of a discrimination, then the
aversiveness of S— is a necessary factor in 
maintaining the discrimination. The removal 
of this aversiveness through the administra- 
tion of a drug causes the discrimination to 
break down dramatically, and S returns to 
responding to S—. If the discrimination in- 
volves no aversiveness, as is presumably the 
case in the errorless discrimination proce- 
dure (Terrace, 1963a, 1963c), S— has never 
been allowed to become aversive (the pro- 
cedure is such that S— is introduced very 
gradually, and in such a way that S never
responds to it, and there can, of course, be 
some question about whether this is genuine 
discriminative responding) and, conse- 
quently, the injection of a drug cannot
attenuate the aversiveness and produce 
responding to S—. Such renewed responding 
to S—, after a discrimination has been 
learned with nonrewards of responses to
S− and frustrative control of discrimination
(or extinction) is removed by a drug, is an
example of phlegmatic persistence. 


Persistence in Relation to Vigor and Choice


Experiment 5, particularly, suggests an
interesting new direction for research which 
may be expected to provide perplexing 
findings almost regardless of the outcome. 
Consider, for a moment, only the starting- 
and running-time phenomena of this ex- 
periment. The prediscrimination condition 
was B+W+, and asymptotic performance
after many trials showed greater vigor of 
responding to B than to W+. There was 
also an indication that B was more resistant 
to extinction than W in the sense that the
B−W+ discrimination (response to B extinguished)
was retarded relative to the
B+W− discrimination (response to W
extinguished), all of this after many trials.

Consider, now, the matter of choice be- 
havior in relation to B+ and W+ at the
late stage represented above. Given a situ- 
ation wherein these stimuli are exposed in 
such a manner as to neutralize all choice- 
inducing factors other than reactions to the 


stimuli themselves, which stimulus should
be selected (“preferred”) by S? Remember
that in the experiment, where starting and
running are faster to B+ than to W+,
there is no opportunity for choice: S is
faced with a single alley which is either B
or W. The question is: does increased vigor
and persistence, presumably related to the
connection of s_F to approach in the black
alley, imply that S will also choose B over
W when not forced to run in B? The situation
is complicated, and there is no unequivocal
prediction of the outcome; however,
the problem is interesting, and it seems
to emerge directly out of the within-S partial-reward
experiment.

An experiment recently reported by 
Davenport (1963) comes close to examining 
this kind of problem. Using a free-and- 
forced-trials procedure in a T maze, in an
experimental study of reversal following
spatial learning, he varied percentages of
reinforcement (100:0, 100:33, 100:67) in
original learning and found that most Ss
developed preferences for the 100% side
early and showed no tendency to switch to
the 33% or 67% side later in training.
Davenport observes that this seems not to
agree with results from runway studies
(e.g., Goodrich, 1959; Weinstock, 1958)
which show higher asymptotic performance
in acquisition for partial than for continuous
groups, where separate groups were
trained under each condition. While the
Davenport experiment does not find supporting
evidence for the paradoxical asymptotic
effect, our within-S procedure does,
and there would seem to be two good reasons
for the difference (and the last sentence of
Davenport’s article shows he is aware of
these). (a) The paradoxical effect does not
occur in the goal measure, when runway
performance is fractionated into starting,
running, and goal segments. In fact, the
goal measure typically shows the reverse
effect, continuous reward being related to
faster running than partial in this segment
(Spence, 1960). If behavior at the choice
point of a T maze is more comparable to
goal entry in a runway than to the other
segments, the Davenport result is not unexpected.
(b) The more important consideration,
however, is that the effect in
question (partial faster than continuous in 
starting and running measures, reversed in 
the goal measure) refers to a vigor or am- 
plitude dimension of behavior, and the rela- 
tion of this phenomenon to choice behavior 
(in the same Ss) is an interesting but, so far, 
neglected problem. The within-Ss PR ex- 
periment of Experiment 5 may provide a 
means of investigating this problem. 


SUMMARY 


When some kind of prediscrimination ex-
posure to one or both of the discriminanda
precedes their presentation as S+ and S-
in a successive discrimination, this preexpo-
sure may either hasten or retard the de-
velopment of differential responding to the
two stimuli. Such retardation, relative to
some appropriate control condition, results
from the transfer to the subsequent dis-
crimination learning of mechanisms ac-
quired in the earlier prediscrimination
treatment. It is plausible that these mecha- 
nisms are the same as those transferred from 
acquisition to extinction in PRE, and that 
in both situations they embrace much of 
what is ordinarily meant by persistence. In 
this sense, then, persistence refers to (a) a 
learned tendency to continue to respond in 
the face of negative (nongoal) indications, 
and (b) a learned retardation of discrimina- 
tion evidenced by relative failure to respond 
differentially to two (or more) stimuli, one 
signaling reward and the other(s) nonreward. 

This monograph reports a variety of ex-
perimental approaches for studying pre-
discrimination effects. Experiment 1 pro-
vided a demonstration of the involvement
of frustration in discrimination learning,
and lent plausibility and support to the
four-stage hypothesis relating frustrative
factors to partial reinforcement effects as
they operate in discrimination learning with
separate (successive) presentation of stim-
uli. The basic set of relationships found to
hold between discriminative performance
and level of frustration, demonstrated in
Experiment 1, encouraged further experi-
ments to test hypotheses concerning pre-
discrimination effects. Some of these tests
were carried out in Experiments 2, 3, 4, and
5.

In Experiment 2, Ss were exposed to pre-
discrimination reward in relation to both
discriminanda (S1+S2+), the variable being
number of such exposures before learning
an S1+S2- discrimination. This provided a
test of the hypothesis that, with initial
positive tendencies to the two discriminanda
equal, rate of discrimination learning is a
positive function of strength of r_R to the
negative stimulus (S2-) in the discrimina-
tion.

Experiment 3 provided evidence (a) that 
discriminative extinction (and discrimina-
tion) is faster after 32 S1± exposures than
after 120 such exposures, confirming the
operation of PRE in failure to discriminate
(discriminative persistence); (b) that after
32 partial rewards to S1, discriminative
extinction is faster when S1 is the negative
discriminandum than when a new stimulus
(S2) is negative, while the reverse is true
after 120 S1 trials; and (c) that retardation
of discrimination is most drastic when the
prediscrimination treatment is S1±S2±,
partial reward preexposure to both of the
eventual discriminanda.

The results of Experiment 4 reflected most 
meaningfully on the third stage of our par-
tial-reward hypothesis, which predicts an
increase in variability at a certain point in 
partial-reward acquisition. The preliminary 
portion of this experiment was a PR con- 
dition run one trial a day with a large re- 
ward. It demonstrated clearly the conflict 
stage hypothesized to occur in PR acquisi- 
tion. 

Finally, Experiment 5 contained a pre-
discrimination condition (S1±S2+) in which
there was either a large or small number of
partial-reward trials in relation to one stim-
ulus and continuous-reward trials in relation
to another. Apart from the implications of
such conditions for subsequent discrimina-
tion learning, the prediscrimination condi-
tion, itself, proved very interesting in that
it revealed PR acquisition effects within Ss:
higher asymptotic performance to S1±
than to S2+ in starting and running (alley)
times, but lower asymptotic performance
to S1± than to S2+ in goal times. This re-
sult had previously been demonstrated only
as a between-groups effect.

The concluding discussion considered 
these data on the relation of nonreward to 
persistence in the light of different theo- 
retical treatments (Hull, Estes, Lawrence, 
and Festinger). There was also a discussion 
of frustrative and nonfrustrative persistence
and a suggestion for new research direc-
tions, notably, the possibility of studying
in the same organism the relationships
among the behavioral dimensions, vigor,
persistence, and choice. The concluding
discussion described the characteristics and
the implications of such a program of re-
search.


OBSERVATIONS ON THE ROLE OF SALIVARY SECRETIONS
IN THE REGULATION OF FOOD AND FLUID INTAKE
IN THE WHITE RAT


W. B. VANCE
Columbia University


Observations on the food and water intake, specific hunger, and taste pref-
erence behavior of rats made desalivate by ligation of the principal salivary
ducts are described. Desalivate rats drink excessive amounts of water when

tion strongly suggest that alterations in salivary composition may alter the
peripheral taste receptor response and thereby the volume intake of sub-
stances which normally stimulate the receptors in the course of ingestion.


The role of salivary secretions in the
regulation of food and fluid intake
has been considered almost exclusively in
connection with Cannon’s theory of thirst,
which maintained that thirst was synony-
mous with a dryness of the oral cavity
normally resulting from a reduced salivary
flow (Cannon, 1918). This xerostomic, or
dry mouth, view of thirst has been espoused
more recently by Gregerson and Cizek
(1961) who, after reviewing the several fac-
tors giving rise to an increased water in-
take, conclude that, “The common factor
...in all forms of thirst examined appears
to be a reduction in salivary flow [p. 325].”

The “dry mouth” theory, however, has
received anything but unqualified experi-
mental support (for review, see Wolf,
1958). Despite a sizable literature on xero-
stomia and water intake, only a few studies
have dealt with desalivate animals. Bidder
and Schmidt (1852) found an enhanced
water intake in desalivate dogs, and Gre-
gerson and Cannon (1932) found that de-
salivate dogs more than doubled their water
intake over preoperative levels when the
animals were exposed to a warm environ-
ment, which induced panting and conse-
quent drying of the oral mucosa. On the
other hand, Fehr (1862) and Montgomery
(1931a) observed little change in the nor-
mal, long-term water intake in totally de-
salivate dogs, and no major changes in
long-term water balance are observed in
humans with congenital absence of saliva
(Austin & Steggerda, 1936; Steggerda,
1939; 1941; Zaus, 1936).

The use of drugs to facilitate salivation 
has also failed to establish rate of salivary 
flow as an important variable in water in- 
take. Pilocarpine, which produces a profuse 
salivary flow, does not reduce water intake 
in rats if given in low dosages, but does 
inhibit drinking at higher dosages even 
though the amount of salivation at the two 
dosage levels is the same (Adolph, 1948). 
In the dog, Kleitman (1927) found pilo-
carpine not only failed to reduce water in-
take, but generally produced an increased
water intake on the day of injection. In the
rabbit, on the other hand, pilocarpine causes
a severe reduction of water intake in water-
deprived animals (Pack, 1923), although
this result may be due to the prostrating
effects of the drug (Gregerson & Cannon,
1932). Montgomery (1931b) failed to ob- 
serve any sizable differences in water in- 
take between normal and desalivate dogs 
after pilocarpine treatment, which strongly 
suggests that any effects of this drug on 
water intake are due to extraoral factors. 

The notion that thirst is mediated by a 
local stimulus or stimuli confined to the 
oral cavity has also been investigated by 
observing the effects of oral anesthesia or 
denervation on water intake. Bellows and 
Van Wagenen (1939) failed to note any 
changes in water intake in dogs with bi- 
lateral divisions of the trigeminal, glosso- 
pharyngeal and chorda tympani, or olfac- 
tory tract, but Valenti (1909) found that 
cocaine applied to the oral cavity, throat, 
and upper esophagus completely inhibited 
drinking in water-deprived dogs. In hu- 
mans, locally anesthetizing the oral cavity 
appears to have little effect on the water 
intake of individuals with diabetes insipi- 
dus (Rowntree, 1922), nor does it reduce 
the thirst of hypernatremia (Leschke, 
1918). 

While the role of saliva in thirst has at- 
tended largely to the volume or rate of sali- 
vary secretion, there have been some sug- 
gestions that the chemical composition of 
saliva may influence taste receptor response 
and thereby the volume intake of sub- 
stances which normally stimulate taste re- 
ceptors in the course of ingestion. Richter 
and MacLean (1939) considered the possi- 
bility that the threshold for NaCl is de- 
pendent on the salivary Na concentration, 
and McBurney and Pfaffmann (1963) have 
recently shown that human thresholds for 
NaCl are higher when the tongue is adapted 
to saliva as opposed to distilled water. The 
fact that human NaCl thresholds are found 
to lie just above the concentration of an 
adapting NaCl solution is well documented 
(Abrahams, Krakauer, & Dallenbach, 
1937; Hahn, 1934; McBurney & Pfaff-
mann, 1963). In addition to threshold
shifts induced by adaptation level, Barto-
shuk, McBurney, and Pfaffmann (1964)
have shown that adapting the tongue to a
given NaCl solution and then stimulating
with concentrations of NaCl slightly above
or below the adaptation level gives rise not
only to changes in threshold, but also to
changes in taste quality.

Powers and Pfaffmann (Pfaffmann, 1961)
have further demonstrated that the dis-
charge of the rat chorda tympani decreases
from a high resting level in response to Д 
М NaCl solution if the tongue has been 
previously adapted to Ringer's solution, 
while an increase from a low resting level
is observed if the tongue has been adapted 
to distilled water prior to stimulation. The 
magnitude of the changes from the respec-
tive resting levels and the initial absolute
level of discharge were also different in the 
two cases. Thus, two dissimilar responses 
result from the same stimulus, depending 
on the character of the fluid bathing the 
receptors prior to stimulation. It is not 
unreasonable, therefore, to suspect that 
changes in salivary composition might re- 
sult in similar changes in taste receptor re- 
sponse. 

In addition to providing a constant adap- 
tation level to taste receptors, saliva may 
alter taste responses by other means. In
studying the ability of humans to detect the
bitter-tasting phenylthiocarbamide (PTC), Cohen
and Ogden (1949) found that PTC must
dissolve in the taster's own saliva in order 
to be detected, and Fisher and Griffin 
(1963) have recently shown that the in 
vitro oxidation of the bitter thioureas pro- 
ceeds faster in the saliva of nontasters than 
in the saliva of tasters. The rapid transfor- 
mation of these substances by the saliva of 
nontasters could account for the failure to 
detect them at low concentrations, since 
rapid conversion would have the effect of 
reducing the concentration even further be- 
fore reaching the receptor site. 

Saliva thus holds the possibility of alter-
ing taste receptor output in three ways:
(a) by providing a constant adapting stim-
ulus to receptors sensitive to constituents
of saliva, the adaptation level varying with
the concentration of the saliva constituent,
e.g., NaCl, (b) by acting as a solvent, e.g.,
PTC, or (c) by chemically modifying the
ingested substance before it reaches the re-
ceptors, e.g., the oxidation of the thioureas.

The chemical diversity of saliva (Spec-
tor, 1956) and its constant contact with
taste cells certainly offer the possibility
for other, perhaps more potent, modifica-
tion of taste cell response and consequently
of the volume intake of those materials
which stimulate taste receptors upon con-
sumption. For example, Kistiakovsky
(1950) has offered the suggestion that
chemical receptor response may be related
to a system of enzymatically catalyzed re-
actions at the receptor site, and differential
enzymatic activity has been observed in
the rabbit taste bud following stimulation
by various materials (El-Baradi & Bourne,
1951). That such enzymatic activity could
be modified by the chemical or physico-
chemical properties of saliva, e.g., pH,
seems not unlikely.

Since there have been no studies on the 
role of salivary secretions in the intake 
regulation of substances other than water, 
the following experiments were conducted 
to evaluate alterations in food and water 
intake, taste preferences, and specific hun-
gers in rats deprived of saliva by ligation
of the salivary ducts. 


METHODS 


There are four pairs of salivary glands in the 
rat: the parotid, submaxillary, and major and 
minor sublingual (Greene, 1959). The submaxil- 
lary and major sublingual glands lie in close ap- 
position, appearing macroscopically as a single 
anatomical unit as do their separate ducts. The 
term “desalivate” will refer hereafter to animals 
whose parotid, submaxillary, and major sublingual 
ducts have been ligated. The minor sublinguals in 
all cases were left intact, primarily because their 
inaccessible location involves considerable trauma 
to muscles of mastication and deglutition when 
they are removed or their ducts ligated. The minor 
sublinguals are quite small and contribute negligi-
bly to total saliva volume (Schneyer & Schneyer,
1959a).

The operative procedure involved an incision of
approximately 2 cm. on the ventral throat, through
which the parotid ducts could be located crossing
the lateral aspect of the masseter, and the parallel
ducts of the submaxillary and major sublingual
glands could be located just prior to their disap-
pearance beneath the anterior belly of the digas-
tric (see Greene, 1959). In the case of the latter
ducts, a single ligature served to occlude both.
The ligatures of the parotid ducts included the
ramus mandibularis marginalis, a sensory-motor
branch of the fifth cranial nerve serving the lower
lip. This nerve runs in close association with the
duct, and freeing the duct invariably involves un-
assessable damage to the nerve. By including the
nerve in the ligature, all desalivate animals are
made uniform with respect to the deficit produced
thereby. All operations were performed under
ether anesthesia.


Materials 


White rats of the Sprague-Dawley strain were 
used as subjects in all experiments and were 
housed in individual wire mesh cages. Fluid intake 
was measured to the nearest milliliter, from in- 
verted 100-ml. graduated cylinders (Nalgene) 
fitted with rubber stoppers and either glass or 
metal spouts having inside diameters of 3 mm. 
and 6 mm. respectively. The cylinders were 
mounted on the fronts of the cages, and the 
drinking spouts projected approximately 2 in. in-
side the cage, with the tip approximately 6 in.
above the cage floor. Except where otherwise
noted, Purina chow was used exclusively in
all experiments and was given either in the form
of pellets or powder. The powdered diet was pre- 
sented in metal feeders, while the pellet diet was 
simply placed on the cage floor. All measurements 
of food intake are to the nearest gram. All solu- 
tions referred to are in gm.%, i.e., grams of solute 
per 100 ml. of solution. Temperature was main- 
tained within 2° of 73° F, and there was no hu- 
midity control. 
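
The gm.% convention defined above can be illustrated with a short calculation. This is a minimal sketch and not part of the original procedure; the function name and the example values are hypothetical.

```python
def grams_of_solute(percent_gm: float, volume_ml: float) -> float:
    """Grams of solute needed for `volume_ml` of solution at `percent_gm` gm.%.

    gm.% is taken, as in the text, to mean grams of solute per 100 ml. of solution.
    """
    if percent_gm < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    return percent_gm * volume_ml / 100.0


# Hypothetical example: solute required for 250 ml. of a 0.9 gm.% solution.
print(grams_of_solute(0.9, 250.0))
```

Because the definition is per 100 ml. of solution rather than of solvent, the solute would be made up to the final volume instead of being added to a fixed volume of water.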


EXPERIMENT 1: EFFECTS OF DESALIVATION
ON FOOD AND WATER INTAKE


The effects of desalivation on long-term 
food and water intake were investigated in 
10 animals, 6 females and 4 males, weigh-
ing from 200 to 300 gm. All animals were
maintained in the experimental cages for 
at least 2 weeks prior to the start of the 
experiment. Body weight, food (powdered 
diet), and water (metal spouts) consump- 
tion were measured for 7 days prior to de- 
salivation and for 25 days following de- 
salivation. Food and water intake were 
recorded daily, and body weight was re- 
corded on alternate days. 

The mean values for each of these meas- 
urements are presented in Figure 1, with 
body weight expressed as percentage of 
body weight at the start of the experiment 
(Day -6). On the first postoperative day,
food intake drops to zero with a gradual 
rise in food intake thereafter until preop- 
erative levels are approached on the 23rd 
postoperative day. Similarly, body weight 
drops off sharply following desalivation, 
but levels off at about 88% of original body 
weight. The failure to recover the preopera-
tive weight was true for all but one animal.

Fig. 1. The effects of desalivation on normal food and water intake. Top panel, body weight as
percentage of original body weight; middle panel, water intake in milliliters; bottom panel, food intake
in grams.


The return of food intake to preoperative 
levels, therefore, does not necessarily mean 
a return to normal levels, i.e., to an intake 
sufficient to maintain normal body weight 
and growth. Prolonged observations in 
other desalivate animals indicate that a 
constant weight loss is sustained under 
these conditions of feeding. 

By far the most striking effect of desali- 
vation is the large increase in water con- 
sumption. Following an initial depression 
on the first postoperative day, water intake 
has increased by more than four times the 
preoperative value on the 18th postopera- 
tive day and shows comparatively little 
change thereafter. 

While the data of Figure 1 indicate the 
general changes in food and water con- 
sumption consequent on desalivation, they 
do not accurately reflect individual per-
formance, as may be seen by inspection of
Figure 2 which gives the individual data on
three animals as examples. Here, both food
and water intake curves (with the excep- 
tion of C) show sharp rather than gradual 
increases. The pattern of food and water 
consumption seen in examples A and B of 
Figure 2 was typical for 8 out of the 10 
animals, with water intake increasing from 
1 to 3 days prior to an increase in food con- 
sumption. In the first of the two exceptions 
to this pattern (C of Figure 2), water in- 
take increased sharply, but was not fol- 
lowed by a similar, abrupt increase in food 
intake. In the second exception, one animal 
failed to show abrupt or gradual increases 
in either food or water intake and subse- 
quently died on the 28th postoperative day, 
having sustained a 45% loss of body weight. 

Fig. 2. Individual records for three animals of Experiment 1. Note that zero on the food
intake record is raised from the base line.

Also apparent in Figure 2 are wide indi-
vidual differences in the length of the post-
operative interval which precedes the in-
creases in food and water intake. The hash
marks at the top of the water intake record
in Figure 2 indicate the number of animals
exceeding double their mean preoperative
water intake on that day. Once an animal
had achieved this level of water intake, it
fully recovered eating and drinking, and its
water intake remained permanently ele-
vated. “Recovery” was defined to mean
surpassing twice the mean preoperative
water intake, and the recovery interval
ranged from 2 to 18 days, except for the
single animal which failed to recover. The
length of the recovery interval was not
correlated with sex, preoperative body
weight, or preoperative food and water in-
take.
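
The recovery criterion just defined can be expressed as a short computation. The sketch below is illustrative only: the function name is hypothetical and the intake values are invented for the example, not taken from the experiment.

```python
from statistics import mean
from typing import Optional, Sequence


def recovery_day(preop_ml: Sequence[float], postop_ml: Sequence[float]) -> Optional[int]:
    """Return the first postoperative day (1-indexed) on which water intake
    surpasses twice the mean preoperative intake, or None if it never does."""
    threshold = 2 * mean(preop_ml)
    for day, intake in enumerate(postop_ml, start=1):
        if intake > threshold:
            return day
    return None


# Hypothetical daily water intakes in ml. (not data from the experiment).
preop = [30, 32, 28, 31, 29, 30, 30]   # 7 preoperative days
postop = [5, 8, 12, 20, 75, 110, 120]  # first 7 postoperative days
print(recovery_day(preop, postop))
```

Taking the first day above the threshold is sufficient here only because, as reported above, intake remained elevated once an animal had reached that level; an animal that never crosses the threshold yields `None`, corresponding to a failure to recover.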

When recovery had occurred, every ani- 
mal demonstrated the same stereotyped 
feeding behavior, which consisted in spend- 
ing 10 or 20 seconds at the food cup, ap- 


proaching and drinking from the water 
spout for 30 seconds or so, and then re- 
turning for another mouthful of food. Ani- 
mals would maintain this shuffling back 
and forth between food cup and drinking
spout for several hours at a time, inter- 
rupted only infrequently by short periods 
of inactivity. 

Other desalivate animals on a diet of
pellets showed the same feeding pattern,
and some few of these were observed to 
crouch near the drinking spout with a pellet 
clutched in their forepaws, alternately nib- 
bling the pellet and licking at the drinking 
spout. On the pellet diet, desalivate animals 
are very inefficient eaters, consuming only 
a fraction of the fragments chewed from 
the pellets, while letting most drop through 
the cage floor. Desalivate animals also fail
to groom effectively, with the consequence
that their fur soon becomes soiled and 
rough, and dried food cakes about the 
mouth and beneath the nails of the fore- 
paws. 

This peculiar and consistent change in
feeding habit produced by desalivation
suggests that the increased water intake
observed in these animals results from the 
use of water as an exogenous saliva to fa- 
cilitate the swallowing of dry food. It also 
appears likely that animals must learn to 
use water in this manner, learning being 
suggested primarily because of the sharp, 
rather than gradual, increase in food and 
water intake typically seen during recovery 
and by the fact that the particular “re-
covered” feeding pattern is never seen prior
to recovery and is never absent after re- 
covery. It is, at any rate, difficult to ac- 
count for the wide variability in the post- 
operative interval preceding recovery on 
the basis of factors other than learning. 


EXPERIMENT 2. EFFECTS OF DIET CON-
SISTENCY ON THE WATER INTAKE
OF DESALIVATE RATS


If the increased water intake of desali-
vate rats does in fact result primarily from
the use of water as an exogenous saliva,
then water intake in these animals should
be reduced when they are allowed a diet
which offers less of a problem to degluti-
tion. In testing this, the food and water
consumption of five normal and five de-
salivate (35 days postoperative) animals
was measured on a dry (powdered) diet, a
wet-mash diet (2 parts water, 1 part dry
diet), and during water deprivation. There
were three females and two males in each
group, and all animals were maintained on
the wet-mash diet for 2 weeks prior to the
observation period.

Fig. 3. The effects of diet consistency on water
intake in normal and desalivate rats. “Dry” refers
to powdered, dry diet, and “2:1 mash” refers to a
wet-mash diet, consisting of 2 parts water to 1 part
dry diet.

The results are presented in Figure 3, 
where mean values for each group are 
given. For the first 4 days on the dry diet, 
water consumption for the desalivate group 
is more than three times that of normals, 
while the food intake of normals is better 
than twice that of the desalivates. When 
allowed the wet-mash diet (Days 5-7), the 
water intake for desalivates falls to zero 
and to near zero for the normals. Food in- 
take for both groups on the wet-mash diet 
is almost identical. During water depriva- 
tion (Days 11-13), food consumption for 
both groups drops, but the drop is much 
more profound for the desalivate group, 
being only 2.4 gm. on the first day of dep- 
rivation and zero for the remaining 2 days. 

On a dry diet then, desalivate animals 
appear to require large amounts of water to 
accomplish deglutition and therefore are 
unable to eat in the absence of water. They 
require no water when the consistency of 
the diet is such that it may be swallowed 
with little difficulty. The constant discrep- 
ancy between normals and desalivate ani- 
mals with respect to dry diet intake is 
directly attributable to the difficulty in 
swallowing dry food in the absence of sa- 
liva. The earlier described feeding pattern 
requires desalivates to expend a great deal 
more energy than normals in ingesting the 
same amount of food. The increased energy 
expenditure undoubtedly contributes to the 
failure to recover preoperative body weight 
observed in Experiment 1, even though food 
intake returns to preoperative levels. 

Many aspects of desalivate feeding be-
havior thus far described bear a striking
resemblance to the behavior of animals re-
covered from the adipsia and aphagia fol-
lowing lateral hypothalamic lesions (Teitel-
baum & Epstein, 1962). Describing the
feeding pattern of Stage IV (recovery) ani-
mals, Teitelbaum and Epstein (1962) noted
that, “Shortly after eating a little dry food,
they drank a little water, ate more food,
drank more water, and so on... [p. 81].”
They described one animal which “. . . would
stand near the drinking tube holding a pel-
let in its front paws, munch on the pellet
for a few minutes, lick actively at the wa-
ter, and then go back to chewing on the
pellet [p. 81].” They further comment that
this animal “. . . would waste as much if not
more food than it actually ate by dropping
small chunks of it through the mesh floor
of its cage [p. 81].” The lack of grooming
and consequent soiling and matting of the
fur are also described as consistent features
of the behavior of animals with lateral hy-
pothalamic lesions.


EXPERIMENT 3. FACTORS AFFECTING THE
RECOVERY OF EATING AND DRINKING IN
DESALIVATE ANIMALS: SIMILARITIES
TO ANIMALS WITH LATERAL
HYPOTHALAMIC LESIONS


The depression of food and water intake 
seen in desalivate animals prior to recovery 
can be made to resemble quite closely the 
typical aphagia and adipsia seen following 
lateral hypothalamic lesions, merely by 
changing the type of diet and drinking 
spout available to the animal. A prolonged 
postoperative recovery interval, or a fail- 
ure to recover altogether, has been observed 
when glass spouts and a powdered diet are 
used, while metal spouts and a pellet diet 
favor rapid recovery. As demonstrated in 
Experiment 1, the use of metal spouts in 
combination with a powdered diet results 
in a fairly even distribution of recovery in- 
tervals, up to the limit survival time. 

The effects of using different types of 
drinking spouts and different types of diet 
can be seen in Figure 4, which gives the 
mean body weight (as percentage of origi- 
nal body weight) and food and water in- 
take values for two groups of animals (two 
females, one male per group), one group on 
metal spouts and one group on glass spouts. 
Both groups were on the powdered diet 


throughout the experiment, except that the 
metal spout group was switched to a diet 
of pellets for the first 4 days following de- 
salivation. 

As may be seen in Figure 4, the group on
metal spouts and a pellet diet shows im-
mediate recovery of eating and drinking,
with the typical, excessive water intake
quite evident on the second postoperative
day. Note also that the transition to the
powdered diet on Day 5 is accomplished
with no difficulty. (On the pellet diet, de-
salivate animals scatter much food outside
the cage and beyond the limits of the col-
lecting pans beneath. The food intake val-
ues for the 4 days on the pellet diet, there-
fore, are spuriously high, since it was
impossible to recover all of the uneaten
food.)

The glass-spout group, on the powdered 
diet throughout, shows a severe depression 
of food and water intake following desali- 
vation. The depression of both food and 
water intake is much more pronounced in 
these animals than in the animals of Ex- 
periment 1, food intake being essentially 
zero for the first 14 days following desali- 
vation. On Day 8, when it appeared that 
none of the animals in the glass-spout group 
would recover, each animal was given 6 gm. 
of milk chocolate (Hershey's), and subse- 
quently received the same ration on Days 
9, 10, and 12. All three animals readily 
seized and chewed the milk chocolate as 
soon as it was placed in the cage, never 
taking more than 15 minutes to consume 
the entire ration. Despite this added ration, 
one animal died on Day 14 at 63% of its 
original body weight. On the following day 
(Day 15), the remaining two animals
showed the abrupt increase in food and
water intake typical of recovery (compare
A and B of Figure 2).

The discrepancy in the interval to re-
covery observed in these two groups of ani-
mals can again be attributed to mechanical
difficulties in eating and drinking which are
a consequence of desalivation. In addition
to the difficulty in swallowing dry food, de-
salivate animals have some difficulty in
drinking from glass spouts, largely because
of the smaller bore, but also because the
water meniscus is recessed in a concavity
which results from fire polishing the tip. In
many cases, desalivate animals provided
with glass drinking spouts have been ob-
served to lick at the tube until the meniscus
receded and no longer contacted the tongue,
at which time the animals would thrust
their lower incisors into the tube, drawing
the meniscus down to a point where it again
could be licked. These animals were quite
proficient in obtaining water in this manner.
Prior to recovery, desalivates would lick at
the glass tubes until the meniscus receded
from contact with the tongue, continue to
lick for some time without obtaining water,
and then begin to paw and bite the tube
until these disturbances shook the meniscus
down to a point where the water was again
accessible. Metal spouts, with the larger
bores, afforded no obstacles to drinking,
and desalivate rats obtain water from these
as readily as normals.

Fig. 4. Factors affecting recovery. Open circles, metal-spout group; filled circles, glass-spout group.
Bar with arrows in food intake record indicates the switch to pellet diet for the metal-spout group for
the first 4 days following desalivation. The underscored data points in the food record indicate days on
which the glass-spout group received a 6-gm. ration of milk chocolate. The arrow indicates the death
of one of the animals in the glass-spout group. Note that zero on the food intake record is raised from
the base line.

As for the effects of the diet type, pellets 
are probably more readily eaten than the 
powdered diet for two reasons. First, the 
larger pieces chewed from the pellets have 
less surface area than an equivalent amount 
of powdered diet and consequently require 
less moisture or lubrication to be swal- 
lowed. Secondly, pellets may be brought to, 
and eaten in, close proximity of the drink- 
ing spout, thus obviating the necessity of 
shuttling back and forth between food cup 
and drinking spout required on the pow-
dered diet. If desalivate animals do, in fact,
have to learn to use water as an exogenous
saliva in order to eat, it might reasonably 
be expected that eating in proximity to the 
drinking spout would favor quicker learn- 
ing and thus more rapid recovery. 

It appears clear that factors which make 
it difficult to ingest food and water (glass 
spouts and powdered diet) prolong the in- 
terval to recovery, and factors which make 
food ingestion less difficult (metal spouts 
and pellet diet) shorten the recovery in- 
terval. In terms of the comparatively wide 
range of recovery intervals observed in Ex- 
periment 1, the present experiment indi- 
cates that only slight changes in procedure 
have the effect of favoring either end of 
that distribution, i.e., immediate recovery 
or a failure to recover with consequent 
death from starvation and/or dehydration. 

We may note here a further similarity 
to lateral hypothalamic animals in the 
ready acceptance of “palatable” food (milk 
chocolate) while “rejecting” dry laboratory 
diet. Such behavior is characteristic of animals 
in the Stage II (anorexia-adipsia) 
phase of recovery from lateral hypothalamic 
lesions (Teitelbaum & Epstein, 1962). 
As regards the Stage I (aphagia-adipsia) 
phase, it does not seem unlikely that on the 
glass-spout-powdered diet regimen as many 
desalivate animals would die from a failure 
to recover eating and drinking as do 
animals with lateral hypothalamic lesions. 

The extraordinary degree of similarity 
between the feeding patterns of recovered 
desalivates and recovered lateral hypothalamic 
animals and the fact that under certain 
conditions desalivation, like lesions in 
the lateral hypothalamus, leads to death 
from starvation and/or dehydration, 
strongly suggest that at least one of the 
effects of lateral hypothalamic lesions is a 
severe reduction in salivary flow. Since the 
salivary glands are under autonomic control 
and autonomic effects are widely elicited 
by hypothalamic stimulation (Hess, 
1948), it seems not unlikely that lesions in 
the hypothalamus could affect salivary 
flow. 

The complete aphagia and adipsia fol- 
lowing lateral hypothalamic lesions, of 
course, cannot be explained entirely on the 
basis of a reduced or absent salivary flow. 
Lateral hypothalamic animals will fail to 
ingest food and fluid of any description and 
will die of starvation if not maintained 
through the immediate postoperative period 
by gastric feeding (Teitelbaum & Epstein, 
1962). Desalivate animals, on the other 
hand, show similar behavior only under 
certain conditions, and failure to recover 
eating and drinking is generally rare. 


EXPERIMENT 4. WATER INTAKE OF 
DESALIVATE ANIMALS DURING 
FOOD DEPRIVATION 


In order to assess the effects of desaliva- 
tion on water intake in response to normal 
dehydration and to avoid the interaction of 
drinking and food intake, water intake was 
measured during food deprivation in four 
groups of eight animals each, prepared as 
follows: normal controls (four males, four 
females), desalivates (four males, four fe- 
males), a parotid group, in which only the 
parotid ducts were ligated (seven females, 
one male), and a submaxillary group in 
which only the submaxillary and major 
sublingual ducts were ligated (seven fe- 
males, one male). The mean body weights 
of the four groups were within 2% of 250 
gm. at the start of the experiment, and all 
operations had been performed at least 30 
days earlier. The procedure simply involved 
recording the water consumption at 12-hour 
intervals over a 72-hour period in the ab- 
sence of food. 

The results (Figure 5) demonstrate a siz- 
able difference in water intake between nor- 
mals and desalivates and a large, but lesser, 
difference between parotid and submaxil- 
lary groups. While both the parotid and 
control groups begin drinking during the 
first 12 hours of food deprivation, the sub- 
maxillary group delays drinking for 12 
hours, and the desalivates take no water 
for the first 24 hours. It might be argued 
that the difference between normals and 
desalivates results from the fact that the 
desalivates, because of their excessive wa- 
ter intake on dry-diet conditions, are over- 
hydrated at the start. This might account 
for the 24-hour delay in drinking, but even 
when desalivates do begin drinking there is 

Fig. 5. Water intake during food deprivation for animals with parotid ducts ligated (labeled “parotid”), 
submaxillary-major sublingual ducts ligated (labeled “sub-max”), desalivates, and normals. 


a clear difference in rate of water intake 
between desalivates and normals (compare 
the difference in slopes). That the differ- 
ences in water intake result from loss of 
saliva and not from initial differences in 
hydration is further evidenced by the fact 
that there is a severe depression in water 
intake for the submaxillary group, which 
shows no increase from normal intake on 
dry-diet conditions. On the other hand, the 
parotid group does show a slightly elevated 
water intake on a dry diet, yet shows little 
difference from normals in water consump- 
tion during food deprivation. Note also that 
once drinking has begun the rate of drink- 
ing is much the same for the parotid and 
normal groups, while a comparable simi- 
larity in rate is observed between desalivate 
and submaxillary groups. 


The difference in water intake between 
the normals and desalivates is just opposite 
to that expected by the xerostomic theory, 
and it is clear that water intake under these 
conditions is determined primarily by fac- 
tors other than a dryness in the mouth. It 
is equally clear that the presence or ab- 
sence of saliva does influence water intake. 
Furthermore, the type of secretion present, 
submaxillary-major sublingual or parotid, 
is important, suggesting that composition 
rather than volume of saliva is responsible 
for the effect, since these two secretions do 
differ markedly in ion content. In particu- 
lar, parotid secretions are high in Na and 
low in K, while just the reverse is true for 
the submaxillary-major sublingual secretions 
(Schneyer & Schneyer, 1959a, 1959b). 
It is possible that a changing concentration 
of one or both of these ions normally accompanies 
dehydration and produces a 
change in taste receptor output which then 
serves as a stimulus to drink. By way of 
example, suppose K ion to be the responsible 
factor. Reducing the K concentration 
of the oral secretions by ligation of the sub- 
maxillary-major sublingual ducts severely 
depresses water intake, while water inges- 
tion is little affected by parotid duct liga- 
tion which would result in very little change 
in the K concentration of the remaining 
secretions. There are, of course, many other 
salivary constituents which might affect 
taste receptor output, and thereby water in- 
take. The point is that the results strongly 
suggest that such a mechanism is involved 
in normal water balance. 


EXPERIMENT 5. WATER INGESTION IN 
RESPONSE TO AN ORALLY AD- 
MINISTERED SODIUM LOAD 


The extent to which salivary secretions 
are involved in “sodium thirst” was inves- 
tigated by comparing the water intake of 
desalivates and normals in response to an 
increase in dietary Na. Five desalivate animals 
(60 days postoperative) and five normals 
were maintained on a wet-mash diet 
(2 parts liquid, 1 part dry diet) for 3 con- 
secutive days under each of three condi- 
tions, viz., when the liquid component of 
the diet was (a) water, (b) 2% NaCl, and 
(c) 5% NaCl. Drinking water was present 
throughout the observation period, and in- 
take was recorded daily. 

Figure 6 presents the mean daily water 
intake for each group, for each of the three 
conditions. As may be seen, there is no dif- 
ference between normals and desalivates in 
the amount of water ingested under any of 
the three conditions, and it may be con- 
cluded, therefore, that salivary secretions 
are not important in water ingestion in re- 
sponse to an orally administered Na load. 


EXPERIMENT 6. WATER INGESTION IN 
RESPONSE TO AN INTRAPERITONEAL 
SODIUM LOAD 


As a double check and to obviate taste 
factors, water ingestion in response to intra- 
peritoneal Na load was measured in the 
same animals used in Experiment 5. Following 
a control period of ½ hour, each animal 

Fig. 6. Water intake for normals and desalivates in response to an orally administered Na load. 

Fig. 7. Water intake for normals and desalivates in response to an intraperitoneal Na load. 


received a 20 ml./kg. load of 3% NaCl solu- 
tion IP, and the water intake was followed 
for a 2-hour postload period. No food was 
present during the observation period, and 
all animals were food and water satiated 
at the start of the experiment. 
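
The size of the load described above follows from simple arithmetic. The sketch below is illustrative only: it assumes that "3% NaCl" means 3 gm. per 100 ml. of solution, and the 250-gm. body weight is a hypothetical value borrowed from the group means of Experiment 4, since the weights of the animals in this experiment are not stated. The function name is likewise an assumption.

```python
def ip_sodium_load(body_weight_g, dose_ml_per_kg=20.0, nacl_percent=3.0):
    """Return (volume in ml, NaCl in g) for an intraperitoneal load.

    Assumes nacl_percent is weight/volume (g per 100 ml); the text does not say.
    """
    volume_ml = dose_ml_per_kg * body_weight_g / 1000.0
    nacl_g = volume_ml * nacl_percent / 100.0
    return volume_ml, nacl_g


# Hypothetical 250-gm. animal (weight not given for this experiment).
volume, nacl = ip_sodium_load(250.0)
print(f"{volume:.1f} ml of solution containing {nacl:.2f} g NaCl")
```

Under these assumptions a 250-gm. animal would receive 5 ml. of solution containing 0.15 gm. of NaCl.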

The cumulated water intake for both 
groups is given in Figure 7. Again, no ap- 
preciable differences are observed between 
desalivates and normals, either with respect 
to latency of drinking or total volume of 
water ingested. It may be concluded, therefore, 
that neither volume nor composition 
of saliva is involved in sodium thirst. 

This finding clearly indicates that water 
ingestion in response to hypernatremia is 
mediated by factors other than those re- 
sponsible for the depression of water intake 
observed in Experiment 4, since any peripheral 
control, such as taste receptor discharge, 
is apparently excluded in sodium 
thirst. From the number and variety of observations 
indicating that a hypothalamic 
osmoreceptor system is involved in water 
ingestion (Anderson, 1953; Anderson & Mc- 
Cann, 1955a, 1955b; Jewell & Verney, 1957; 
Cross & Green, 1959), it seems probable 
that sodium thirst is mediated largely by 
central rather than peripheral factors. 

The degree to which central versus peripheral 
factors are involved in normal 
water balance is not clear. It should be 
pointed out, however, that peripheral fac- 
tors appear quite important under condi- 
tions of normal dehydration (Experiment 
4), whereas central (or at least, extraoral) 
factors are prepotent in sodium thirst which 
accompanies a degree of hypernatremia not 
seen in mild dehydration. The question is 
further complicated by the fact that peripheral 
changes, though they appear unlikely, 
cannot be entirely excluded in sodium 
thirst, since taste receptor response 
could still be modified in the absence of 
saliva by mucous and/or minor sublingual 
secretions, or through changes reaching the 
receptors directly via the vascular system. 
This point will be considered in more detail 
later. 


EXPERIMENT 7. EFFECTS OF DESALIVATION 
ON THE PREFERENCE FOR SODIUM 
CHLORIDE SOLUTIONS 


Substances for which the white rat nor- 
mally shows a preference are well suited to 
assess any effects desalivation may have on 
gustatory output, since such preference be- 
havior appears to depend largely on the 
stimulus properties of the substance (Bare, 
1949), preference being completely abol- 
ished when afferent gustatory impulses are 
interrupted by peripheral nerve section 
(Richter, 1939) or by coagulation of the 
gustatory nuclei of the thalamus (Oakley 
& Pfaffmann, 1962). Alterations in taste 
preference following desalivation, then, may 
reasonably be attributed to modification in 
afferent gustatory discharge. 

The normal preference for NaCl (sodium 
chloride) solutions in the white rat (Bare, 
1949) was selected as a representative taste 
preference on which to assay the effects of 
desalivation. Four groups of animals were 
prepared as in Experiment 4: a normal con- 
trol group (six animals), a desalivate group 
(four animals), a parotid group with only 
the parotid ducts ligated (five animals), and 
a submaxillary group with only the sub- 
maxillary-major sublingual ducts ligated 
(five animals). All animals were females. 
Operations were performed on the same day, 
and the animals were tested on an ascend- 
ing series of NaCl solutions of 0.1, 0.3, 0.6, 
0.9, 1.2, 1.5, and 2.0 gm.% beginning on the 
4th postoperative day. Two bottles, one tap 
water and one NaCl solution, were affixed 
to the front of each cage, and each concen- 
tration was presented for 48 hours, with 
the bottle positions being switched every 
24 hours to correct for position preference. 
The animals were allowed pellets ad lib. 
throughout the experiment. 

Figure 8 gives the mean 48-hour intake 
of NaCl and total fluid (NaCl plus water) 
for each of the four groups. Desalivate 

Fig. 8. NaCl preference in animals with parotid 
ducts ligated (labeled “parotid”), submaxillary-major 
sublingual ducts ligated (labeled “sub-max”), 
desalivates, and normals. Preference test 
was begun on the fourth postoperative day. 


animals show an extraordinary enhancement 
in NaCl preference, while the parotid group 
shows a lesser, but still sizable, increase in 
NaCl intake. The submaxillary group shows 
no difference from normal. 

It is clear from these effects that the 
presence or absence of saliva is of considera- 
ble importance in regulating the amount of 
NaCl an animal will ingest. Exactly what 
is responsible for this effect is not clear, but 
it should again be noted that the secretions 
of the parotids and submaxillary-major 
sublinguals are markedly different in ion 
content. In this case, it appears possible that 
the increased NaCl intake results from a 
decrease in Na concentration in the oral secretions. 
Thus, ligating the parotids leads 
to a large reduction in salivary Na and a 
large increase in NaCl intake, while ligating 
the submaxillary-major sublingual ducts re- 
sults in only minor salivary Na reduction 
and no change in NaCl preference. Note 
also that in terms of the effects on water 
intake in the absence of food (Figure 5), 
the effects of partial desalivation are just 
opposite to the effects on NaCl intake, i.e., 
parotid duct ligation has only a slight effect 

Fig. 9. NaCl preference for the same groups of animals as in Fig. 8, but for preference test begun 
on the 30th postoperative day. Labels as in Fig. 8. 


on water intake, but enhances NaCl intake 
considerably, whereas ligating the submax- 
illary-major sublingual ducts has a pro- 
nounced effect on water intake and no ef- 
fect on NaCl intake. 

Another interesting effect of desalivation 
on NaCl intake was discovered when the 
preference was again tested in the same 
animals beginning on the 30th postoperative 
day (ascending series of concentrations of 
0.1, 0.4, 0.9, and 1.4 gm.%). As is evident 
in Figure 9, the exaggerated NaCl prefer- 
ence seen in desalivate animals during the 
immediate postoperative period is almost 
completely absent when tested 30 days 
postoperatively, and the preference for the 
parotid group is now no different from con- 
trols, disregarding differences in total fluid 
intake. It should be noted in Figures 8 and 
9 that the normal preference curves for the 
controls appear flat due to the greatly expanded 
ordinate. 

To get a clearer picture of the change in 
the NaCl intake with time, the daily intake 
of water and a single NaCl solution (1 
gm.%) were followed in three animals 
(males, 285-325 gm.) continuously through- 
out an extended postoperative observation 
period. As in the previous preference tests, 
bottle position was changed every 24 hours, 
and pellets were allowed ad lib. 

Figure 10 presents the mean daily intake 
of water and NaCl for the three animals. 
Intake of NaCl shows a striking increase 
following desalivation, reaching a maximum 
on the 4th postoperative day and then de- 
clining until water and NaCl intake are 
about equal on the 50th postoperative day. 
The fact that the preference for NaCl undergoes 
such striking and rapid changes 

Fig. 10. The course of the NaCl preference following desalivation. 


during the immediate postoperative period 
makes it difficult to interpret the preference 
curves of Figure 8 where changes in preference 
are confounded with changes in concentration. 

The rapid increase and gradual decrease 
of preference for NaCl in desalivate ani- 
mals has been investigated under a variety 
of conditions and is a consistent finding. 
In desalivate animals, the preference for 
NaCl, once having diminished to a point 
where no preference is evidenced, has never 
been observed to return to normal, even 
after long periods during which animals are 
allowed access to water only. 

The cause of these effects is obscure, but 
on the surface desalivation appears to lead 
to a temporary modification of the taste receptor 
response which gives rise to an enhancement 
of preference for NaCl and later 
leads to a more permanent change of receptor 
response which gives rise to no preference. 
How these changes are accomplished 
is not at all clear. Desalivate animals in 
which the NaCl preference is no longer evident 
do reject NaCl in higher concentrations 
(2%) as do normals, so that receptor sensitivity 
to Na remains. While the initial 
increase in NaCl intake appears related to 
a decrease in the Na concentration of the 
oral secretions, it is not clear why or how 
this should lead to an eventual disappearance 
of the preference. 


EXPERIMENT 8. EFFECTS OF DESOXYCORTICOSTERONE 
ACETATE ON THE SODIUM 
CHLORIDE INTAKE OF NORMAL 
AND DESALIVATE RATS 


Normal rats given injections of desoxy- 
corticosterone acetate (DOCA) will in- 
crease their selection of NaCl, even though 
DOCA promotes Na retention and elevates 
serum Na (Richter, 1943). Since salivary 
Na content is inversely proportional to 
DOCA titer (Goding & Denton, 1956; 
Frawley & Thorn, 1951) and since Experi- 
ment 7 indicated that NaCl intake may be 
inversely related to salivary Na, it seemed 
likely that the effects of DOCA on NaCl 
preference result from the reduction in sal- 
ivary Na produced by this hormone. This 
being the case, there should be no increase 

Fig. 11. The effects of DOCA on the NaCl intake of normal and desalivate rats. The addition of NaCl 
to the diet was made by replacing water with the indicated concentrations of NaCl solution in mixing 
the wet mash (2 parts liquid, 1 part dry diet). The animals received daily subcutaneous injections of the 
indicated amounts of DOCA. 


of NaCl intake in desalivate animals fol- 
lowing treatment with DOCA. 

To test this notion, six animals—three 
desalivates (40 days postoperative) and 
three normals—were allowed a choice be- 
tween tap water and 1.0 gm.% NaCl solu- 
tion, and the effects of DOCA (Renalcort, 
Sonorol Labs.) on NaCl intake were ob- 
served. All animals were females and were 
maintained on a wet-mash diet (2 parts 
water, 1 part dry diet) to minimize differ- 
ences in total fluid intake. 

The results demonstrate (Figure 11) that 
DOCA increases the NaCl intake of de- 
salivates by more than twice that of normal 
animals. The difference between normals 
and desalivates is preserved even when 
NaCl is added to the diet (see Figure 11). 
Note that additional dietary Na also in- 
creases water intake, but does not depress 
NaCl intake. When DOCA treatment is 
discontinued, the NaCl intake for both 
groups falls promptly to zero. 

These results are of considerable interest, 
but unfortunately tend to obscure even 
more the question of how saliva influences 
the intake of NaCl. Changes in salivary Na 
concentration are excluded in desalivate 
animals, yet the exclusion of saliva leads 
to an enhanced NaCl intake with respect 
to normals. A consideration of possible ex- 
planations for this effect will be deferred 
until later. 


EXPERIMENT 9. EFFECTS OF ADRENALECTOMY 
ON THE INTAKE OF SODIUM 
CHLORIDE AND SUCROSE IN DESALIVATE 
RATS 


Since saliva can so potently alter the 
NaCl intake in normal animals, it is of 
interest to consider the possibility that 
changes in salivary composition account 
for the increased Na intake seen in adrenalectomized 
rats (Richter & Eckert, 1938). 
In assessing this possibility, the NaCl in- 
take of desalivate animals was determined 
both prior to, and following, bilateral removal 
of the adrenal glands. Sucrose intake, 
which Rice and Richter (1943) found to 
decrease in adrenalectomized rats, was also 
measured prior to and following adrenalec- 
tomy. To avoid the problems associated 
with the disparate total fluid intake of de- 
salivates and normals on a dry diet, NaCl 
and sucrose were presented singly for 7 
hours each day in the absence of food, and 
the animals were maintained on pellets and 
water for the remaining 17 hours. A 1% 
NaCl solution was presented for 7 hours 
on each of 3 successive days, followed 
by a similar presentation of a 10% sucrose 
solution, also for 3 consecutive days. The 
same sequence was repeated beginning on 
the 4th day succeeding bilateral extirpation 
of the adrenals, with the exception that 
a 1% NaCl solution was substituted for 
water during the 17-hour maintenance period 
when sucrose intake was observed. Ten 
female animals served as subjects, 5 desali- 
vates (43 days postoperative) and 5 nor- 
mals. Two desalivate animals failed to sur- 
vive adrenalectomy. 

Figure 12 gives the mean 7-hour intake of 
NaCl and sucrose for the normals and sur- 
viving desalivates both prior to and follow- 
ing adrenalectomy. Preoperatively, desali- 
vates drink little NaCl, but both desalivates 
and normals increase their NaCl intake 
following adrenalectomy. The preoperative 
sucrose intake of desalivates is higher than 
that for normals, and unlike normals, the 
desalivates show only a slight decrease in 
sucrose intake following adrenalectomy. 
Desalivate animals, therefore, can and do 
increase their intake of NaCl in response to 
adrenalectomy, and changes in salivary 
composition thus do not appear important in 
mediating the increase. On the other hand, 
saliva does have an appreciable effect on 
the sucrose intake of desalivate and desali- 
vate-adrenalectomized animals. 


EXPERIMENT 10. EFFECTS OF DESALIVATION 
ON THE QUININE REJECTION THRESHOLD 


The quinine rejection threshold has been 
a popular assay for changes in the afferent 
gustatory system following lesions of the 
central nervous system (Ables & Benjamin, 
1960; Anderson & Jewell, 1957; Oakley & 
Pfaffmann, 1962; Patton, Ruch, & Walker, 
1944). Similarly, changes in the quinine re- 
jection threshold in desalivate rats can be 
used to assess alterations in afferent input 
from changes presumably localized at the 
receptor level. 

The quinine threshold was determined in 
five normals and four desalivates by the 
two-bottle method described in Experiment 
7. An ascending series of .00005, .0001, 
.0005, .001, .005, .01, .05, and .10 gm.% 
quinine hydrochloride was used, with each 
concentration being presented for 48 hours. 
The threshold was arbitrarily defined at 
75% rejection of quinine, i.e., when the 
percentage of quinine of the total fluid 
intake (quinine plus water) was 25% or 
less. 
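
The criterion just stated can be written out as a short calculation. The sketch below is a minimal illustration, not the procedure actually used to score the records: the function names and the intake values in the example are hypothetical, and it assumes the threshold is taken as the lowest concentration in the ascending series at which quinine makes up 25% or less of total fluid intake.

```python
def quinine_percent(quinine_ml, water_ml):
    """Quinine solution intake as a percentage of total fluid intake."""
    total = quinine_ml + water_ml
    if total <= 0:
        raise ValueError("total fluid intake must be positive")
    return 100.0 * quinine_ml / total


def rejection_threshold(concentrations, quinine_ml, water_ml, criterion=25.0):
    """Lowest concentration (ascending series) meeting the rejection criterion.

    Returns None if no concentration in the series reaches the criterion.
    """
    for conc, q, w in zip(concentrations, quinine_ml, water_ml):
        if quinine_percent(q, w) <= criterion:
            return conc
    return None


# Hypothetical 48-hour intakes (ml) for one animal; not data from the study.
series = [0.00005, 0.0001, 0.0005, 0.001]
quinine = [20.0, 18.0, 9.0, 2.0]
water = [20.0, 22.0, 31.0, 38.0]
print(rejection_threshold(series, quinine, water))
```

With these made-up intakes, quinine accounts for 50%, 45%, 22.5%, and 5% of total intake at successive concentrations, so the threshold falls at the third step of the series.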

As may be seen in Figure 13 the rejection 
threshold so defined is elevated in desali- 
vates by almost two log units over the nor- 
mal threshold. We may again conclude 
that a salivary influence on taste receptor 
output is important here, but there is in 
this case no obvious implication of any 
specific salivary constituent responsible for 
the threshold change. 


EXPERIMENT 11. EFFECTS OF DESALIVATION 
ON SUCROSE REJECTION IN PANCREATECTOMIZED 
RATS 


The influence of saliva on sucrose intake 
was investigated in connection with the re- 
jection of sucrose seen following pancrea- 
tectomy (Richter & Schmidt, 1941). The 
two-bottle technique was again employed, 
allowing a choice between tap water and a 
10% sucrose solution. Four female animals 
served as subjects, with the pellet diet avail- 
able ad lib. as before. Following a control 
period of 5 days, pancreatectomy was per- 
formed by IP alloxan injections. Depending 
on the individual, a dose of from 100 to 200 
mg./kg. resulted in an almost complete re- 
jection of sucrose (Figure 14). When the 
animals were desalivated on the 19th day, 
there followed an immediate increase in su- 
crose intake. On the 23rd day, the ligatures 
were removed and sucrose rejection returned. 
On the 30th day, salivary flow was stimulated 
by pilocarpine injection (0.5 mg.), 
which resulted in a small, but definite, increase 
in sucrose rejection. Finally, desalivation 
on the 34th day was once again accompanied 
by a prompt increase in sucrose 
intake. 

The sucrose rejection by pancreatectomized 
rats appears to result exclusively 
from factors in the salivary secretions. Glucose 
appears a likely candidate for the responsible 
salivary factor, since glucose is 
present in small amounts or not at all in 
the normal secretions of humans (Reid, 
Hawkins, & Pigman, 1955), dogs (Langley, 
Gunthorpe, & Beall, 1958), cats (Carlson 
& Ryan, 1908), and rabbits (Jeangros, 
1928), but is regularly found in quantity in 
the saliva of individuals with diabetes mellitus 
(Becker & Kestermann, 1936; Binet, 
1926), and in dogs treated with phloridzin 
(Pearce, 1916) or alloxan (Langley, Gunthorpe, 
& Beall, 1958). 

If these observations also apply to the rat, 
then normal saliva would contain only small 
amounts of glucose, while the saliva of rats 
with diabetes mellitus would contain ap- 
preciable amounts of glucose. If we assume 
that sucrose preference is inversely related 
to the amount of salivary glucose, then desalivation 
in normal animals would reduce 

Fig. 14. The effects of desalivation on the sucrose preference of pancreatectomized rats. 


an already small glucose concentration to 
zero, with the result that the sucrose pref- 
erence would be only slightly increased 
(compare the preoperative sucrose intake 
values for normals and desalivates, Figure 
12). In the case of pancreatectomized rats, 
there would be a high concentration of glu- 
cose in the saliva bathing the receptors, and 
desalivation would again reduce this con- 
centration to zero with the consequent re- 
versal of preference. 

Glucose thus appears to be the most 
obvious salivary factor which might be 
involved in the modification of sucrose in- 
take, but the question of whether it is in 
fact the responsible agent remains to be 
seen. 


Discussion 


The effects of desalivation on food intake 
can be attributed entirely to the mechanical 
difficulty of swallowing dry food, which 
appears as an obvious consequence of the 
loss of saliva. The exaggerated water intake 
seen in desalivate animals when feeding on 
a dry diet appears to result entirely from 
the use of water as an exogenous saliva to 
moisten food and render it capable of ingestion. 
Epstein, Spector, Samman, and 
Goldblum (1964) have recently reported a 
similarly exaggerated water intake associ- 
ated with dry-diet feeding in partially 
salivarectomized rats. The point of particu- 
lar interest in the food intake of desalivate 
animals is the mode of food ingestion or 
feeding pattern, which, as has been noted, 
appears indistinguishable from that ob- 
served in animals recovered from lateral 
hypothalamic lesions. 

The explanations for alterations in intake 
of water (under food deprivation), NaCl, 
quinine, and sucrose are not immediately 
clear, but, as has been repeatedly suggested, 
these alterations seem most probably the 
result of a modification of the peripheral 
gustatory response. While the nature of 
such modification remains obscure, it is 
significant to point out that it is confined 
to the periphery. It is generally held that 
peripheral taste receptor response remains 
invariant in the face of changes in the 
internal environment which are regularly 
associated with alterations in intake (Pfaffmann, 
1957). This view is supported largely 
by the failure to note changes in the integrated 
response of the chorda tympani in a 
variety of instances where the physiology 
of the internal environment has been altered 
in a manner known to affect taste preferences 
(Hagstrom, 1959; Nachman & Pfaffmann, 
1963; Pfaffmann & Bare, 1950; 
Pfaffmann & Hagstrom, 1955). The method 
of recording employed in all of these experiments 
involves the deliberate exclusion 
of saliva, and it is therefore not surprising 
that the peripheral response appears in- 
variant under these conditions if modifica- 
tion of the receptor output results from 
changes in the composition of saliva. 

The fact that DOCA treatment and 
adrenalectomy both result in an increased 
NaCl intake in the absence of saliva ap- 
pears to be more in line with the current 
view of a fixed peripheral input with varia- 
bility in preference resulting from the 
modification of some central structures 
(Nachman & Pfaffmann, 1963; Pfaffmann, 
1957). The greater increase in NaCl intake 
in desalivate versus normal rats with DOCA 
treatment is not inconsistent with the in- 
terpretation that this hormone acts cen- 
trally, given that the desalivates already 
have some peripheral modification at the 
outset. 

In those cases where salivary influences 
are excluded, as in sodium thirst and the 
already noted instances of increased NaCl 
intake with DOCA and adrenalectomy, the 
peripheral response could still be modified 
by changes reaching the receptors via the 
vascular system or by changes in the compo- 
sition of the residual secretions remaining 
after desalivation. With regard to the latter 
possibility, it should be noted that the 
mouths of desalivate rats are moist, and 
the tongue and buccal membranes remain 
in excellent condition. Although our animals 
were not totally desalivate (minor sublinguals 
intact), Montgomery (1931a) described 
a similar condition for the mouths 
of totally desalivate dogs. Mucous secretions 
thus appear to be present in quantity 
sufficient to prevent a severe drying of the 
mouth, and, indeed, 45% of the oral secretions 
in man are derived from mucosal 
glands alone (Schneyer & Levin, 1955). 
There is, then, at least the possibility that 
taste receptor response can be altered by 
an alteration in the composition of the 
mucous secretions remaining after desalivation. 

The possibility of the peripheral taste 
receptors being influenced by changes in 
the vascular system appears unlikely, since 
such changes would apparently be evident 
in the integrated response of the chorda 
tympani even when saliva and other secre- 
tions are excluded and, as already noted, no 
such changes have been observed. The question 
is, given a single quality, e.g., NaCl, 
how accurate a measure is the purely quantitative 
integrated response of the observed 
behavioral preference? The fact that abolition 
of an estimated 85-90% of the total 
gustatory afferent input by peripheral nerve 
section alters the normal preference of the 
white rat for NaCl very little (Pfaffmann, 
1952; Richter, 1939) suggests that some 
variable other than magnitude of response 
may be more important in modifying taste 
preferences. 

It is possible that chemical changes in 
the extracellular environment, including 
changes in hormone titers, modify the pat- 
tern of responding of the taste receptors 
while altering the total nerve response very 
little. Such possibilities can only be assessed 
by recourse to single-fiber recording meth- 
ods. 

In conclusion we may state that the ob- 
servations on desalivate rats strongly indi- 
cate that peripheral control is important in 
regulating intake of food and fluid and that 
such control appears to result from a modi- 
fication of the peripheral taste receptor re- 
sponse. It seems likely that such modifica- 
tion results in turn from changes in salivary 
composition which can (a) serve as a 
stimulus acting directly on the taste cells 


and (b) modify the response of the taste 
cells to exogenous stimuli. 


ON THE BIPOLARITY OF SEMANTIC SPACE 

The assumption of Osgood and his associates that semantic space can be 
described by 3 generalized bipolar factors is called into question. The 
bipolarity model in particular is discussed, and a test of it was provided by 
having Ss rate a series of concepts on a single-adjective form of the semantic 
differential. Correlation and factor analytic results fail to yield support 
for a generalized, bipolar, and symmetrical model of semantic space. Some 
adjective pairs seem to have a greater likelihood of being functionally bi- 
polar than others, although there seems to be some sort of a concept-scale 
interaction involved in the existence of bipolarity. The implications of these 
findings are discussed. 


SIR ARTHUR EDDINGTON (1939), in one of 
his many discussions on the philoso- 
phy of science, has described a limitation 
within which most scientists must operate. 
This limitation is most vividly illustrated in 
Eddington’s “fishnet analogy.” Eddington 
describes a hypothetical instance in which 
a scientist, wanting to conduct an inves- 
tigation of the size of fish in the sea, weaves 
a fishnet with a 2-inch mesh. After careful 
and representative samplings of catches, 
the scientist measures each fish in his sam- 
ple and comes to the conclusion that “there 
are no fish in the sea under two inches.” 
The moral in this analogy is that a degree 
of “selective subjectivism” (Eddington, 
1939) is involved in the construction of our 
measuring instruments and this limits the 
knowledge obtainable with any such instru- 
ment. 

The purpose of this paper is to investi- 
gate the possibility of selective subjectivism 
in the description and construction of se- 
mantic space as represented by Osgood and 
his associates’ (Osgood, Suci, & Tannen- 
baum, 1957) hypothetical model of the 
of Rochester College of Arts and Sciences for a 
grant to carry out these analyses. 


dimensionality of meaning. These authors 
assume that semantic space is bipolar; it 
is this assumption in particular that has 
given rise to the evaluation of their model 
reported in this paper. 

In addition to assuming that semantic 
space is bipolar, Osgood and his associates 
believe this space to be three dimensional, 
each dimension representing one of the 
three major factors of evaluation, potency, 
and activity along which meaning is said 
to vary. Other factors were obtained in 
their analyses, but have so far been con- 
sidered to be of insufficient importance and 
interpretability to warrant using them to 
describe semantic space. Each of the three 
major dimensions is represented by op- 
posite adjectives, and all three dimensions 
pass through a common origin at which 
point neutrality of meaning (i.e., meaning- 
less) presumably exists. 

Each of the three dimensions used to 
describe semantic space has been defined 
by several 7-point bipolar semantic differ- 
ential rating scales which, by means of 
factor analyses, were found to cluster to- 
gether. Thus, the evaluative dimension is 
represented by scales such as good-bad, 
pleasant-unpleasant, nice-awful, and so on. 
The potency dimension is characterized by 
strong-weak, heavy-light, and hard-soft. 
The third factor—the activity dimension— 
is depicted by such scales as active-passive, 
fast-slow, and agitated-calm. 


ASSUMPTION OF BIPOLARITY 


Osgood et al. (1957) have attempted to 
link their model of semantic space with 
learning theory. They hypothesize that 
some external stimulus or sign triggers off 
an internal representational mediation 
process (rm), which itself is stimulus-pro- 
ducing (rm → sm). It is the mediational 
rm which is believed to carry the meaning 
of the sign. Osgood et al. (1957) further 
link this mediational theory with their as- 
sumption of bipolarity. To quote: 


Corresponding to each major dimension of the 
semantic space, defined by a pair of polar terms, 
is a pair of reciprocally antagonistic mediating re- 
actions, which we may symbolize as rmI and r̄mI 
for the first dimension, rmII and r̄mII for the 
second dimension, and so forth. Each successive 
act of judgment by the subject using the semantic 
differential, in which a sign is allocated to one or 
the other direction of a scale, corresponds to the 
acquired capacity of that sign to elicit either rm 
or r̄m, and the extremeness of the subject's judg- 
ment corresponds to the intensity of reaction as- 
sociating the sign with either rm or r̄m [p. 27]. 


Despite the theoretical reasoning of Osgood 
et al. in support of the assumption of the 
bipolarity of semantic space, whether or 
not individuals actually do in practice at- 
tribute meaning to signs along bipolar (ad- 
jectival opposites) continua remains an un- 
tested assumption. 

A study which comes close to testing the 
bipolarity assumption is one by Ross and 
Levy (1960), who were interested in deter- 
mining whether or not adjectival opposites 
had a similar range of usage. Subjects were 
given playing cards and instructed to make 
the “most beautiful” and “ugliest,” the 
“simplest” and “most complex,” and the 
“commonest” and “most unusual” patterns 
possible. Ross and Levy found that, in 
comparison with their respective opposites, 
“ugly,” “complex,” and “unusual” elicited 
a wider range of patterns. They concluded 
that “the postulation of antonyms as being 
of equal and opposite semantic force is a 
convenient but probably artificial assump- 
tion in the description of simple patterns 
[Ross & Levy, 1960, p. 187].” Though 
Ross and Levy do offer some weak evidence 
against the assumption that semantic space 
is bipolar in nature, their study is limited 
in the number of paired adjectives used. 
Further, their procedure does not parallel 
what happens when an individual attrib- 
utes meaning to a concept. Instead of pre- 
senting subjects with a sign or concept and 
requiring them to indicate their mediational 
response (rm) as is done with the semantic 
differential, subjects in Ross and Levy's 
study were presented with the mediational 
response (i.e., the adjectives) and asked to 
reproduce a possible sign (i.e., a playing 
card pattern). 


Reciprocally Antagonistic Aspect of Bi- 
polarity 


As indicated above, the assumption that 
semantic space is bipolar implies that an 
individual’s mediational responses to signs 
occur along a series of dimensions which 
have reciprocally antagonistic verbal op- 
posites at their polar points. Accordingly, 
when a subject is asked to rate a concept 
on a semantic differential scale, he is al- 
lowed to make only one check mark. Thus, 
he can respond with: 


good ___:___:___:___:___:_X_:___ bad 


or 


good ___:_X_:___:___:___:___:___ bad 


He cannot respond to a concept by checking 
both sides of the continuum as follows: 


good ___:_X_:___:___:___:_X_:___ bad 


The assumption of reciprocal antagonism, 
like that of bipolarity, is built right into the 
measuring instrument itself and conse- 
quently leaves no room for disconfirmation. 

It should be noted that Osgood et al., 
even though they propose a theory in 
which opposite pairs are reciprocally an- 
tagonistic, really allow the subject to make 
a response through which he can try to 
indicate that he believes both sides of the 
scale are appropriate for his response to 
the concept. Ratings which reflect this 
tendency are assigned to the middle cate- 
gory, i.e., the 4 rating. Unfortunately, a 
rating in the 4 category can also indicate 
that the subject considers the concept to be 
neutral on the scale or that he feels the 
scale is completely irrelevant to the con- 
cept being rated. Unless the subject is asked 
to indicate which of the three reasons his 
particular 4 rating reflects (a procedure, 
which for some reason apparently has not 
been suggested by anyone), the meaning 
of a rating in this category is ambiguous. 
The most relevant point here is that Os- 
good et al. seem to recognize that a subject 
is at times tempted to respond to a concept 
by using presumably mutually incompat- 
ible mediators. Despite this recognition, 
they nevertheless maintain their assump- 
tion on the bipolar and reciprocally an- 
tagonistic character of semantic space. 


TWO QUESTIONS TO BE STUDIED 

In studying the issue of bipolarity, two 
separate, though not unrelated questions 
emerge: (a) Is semantic space bipolar? (b) 
Assuming that semantic space is bipolar, 
have Osgood and his associates constructed 
semantic differential scales which accu- 
rately reflect this bipolarity? Each of these 
two issues is briefly outlined below. 


Bipolarity of Semantic Space 


Whether or not semantic space is truly 
bipolar (adjectival opposites) or unidirec- 
tional (single adjectival dimensions) is a 
question left open by Osgood et al. Stated 
in a slightly different way, the question 
becomes one of investigating the type of 
continuum along which individuals assign 
meaning to concepts. On the one hand, it 
seems logical to assume that individuals 
naturally think in terms of opposites and 
that any attempt to measure meaning 
should use a scaling procedure which takes 
into account this bipolarity. On the other 
hand, one can think of adjectival dimen- 
sions used by individuals in assigning 
meaning to concepts which seem to be uni- 
directional rather than bipolar. An example 
of such a unidirectional dimension would 
be the “loneliness” continuum. From the 
manner in which the adjective lonely is em- 
ployed in common usage, its referent ap- 
pears to range from no loneliness to max- 
imum loneliness and not from loneliness to 
the opposite of loneliness (whatever that 
might be). Because the loneliness dimension 
cannot readily be cast into a bipolar con- 
tinuum, it cannot be used as a semantic 
differential scale. Further, to the extent 
that such unidirectional adjectival dimen- 
sions are implicitly used by individuals in 
their representational mediational processes 
(rm's), the semantic differential places an 
artificial limit, or fishnet, if you will, on 
the measurement of meaning. 


Bipolar Characteristics of Semantic Dif- 
ferential Scales 


In a review of the semantic differential, 
Carroll (1959) has criticized Osgood et al. 
for their purely judgmental, and sometimes 
arbitrary, specification of the opposite pole 
of a scale. Osgood and his associates allude 
to this problem when they discuss the ne- 
cessity for scales to have the properties of 
polar opposites which mark off a linear 
dimension passing through an origin. They 
offer no evidence that their scales fulfill 
these criteria, but indicate: "At present 
we merely assume that the scales defined 
by familiar and common opposites have 
these properties, but research on the prob- 
lem needs to be done [Osgood et al., 1957, 
p. 79]." 

Several of the semantic differential scales 
are indeed defined by word pairs which are 
typically regarded as being "familiar and 
common opposites" (e.g., good-bad, large- 
small, happy-sad). In cases such as these, 
if a person reports that he is characterized 
by "the opposite of" one adjective, then we 
presume the other adjective almost without 
exception; if we are the opposite of happy, 
then we are sad. This all seems very neat 
until we recall that it is not uncommon to 
see someone both laugh and cry at the 
same time and report feeling both happy 
and sad. If we are in one state, can we also 
be in the other and still maintain that they 
are opposites? Such may be possible if 
parallel and different reactions can go on 
to different aspects of the same situation at 
the same time within the same person. This 
issue will be discussed in detail further on 
in this paper. The point to be made here is 
that we cannot afford to be too sure even 
of those “familiar and common" word pairs 
until further research has been done. 

There are other semantic differential 
scales set up by Osgood et al. which seem 
to have polar points not widely accepted as 
being clearly opposite in meaning (e.g., 
active-passive, rugged-delicate, sacred-pro- 
fane). The bipolarity is not clear-cut in the 
sense that one polar adjective may imply 
another dimension which its opposing ad- 
jective does not. Thus, the opposite of pas- 
sive implies activity, but it may also imply 
aggressiveness. 

In still other scales, the assumption of 
bipolarity may be inappropriate because 
both polar scales imply some similar char- 
acteristic. An example of this, which is 
acknowledged by Osgood et al., is in the 
rugged-delicate scale. When presented in- 
dividually, both “rugged” and “delicate” 
possess similar positive evaluative char- 
acteristics. 

It should also be noted that some of the 
scales have undergone successive revisions, 
with the result being that succeeding scales 
have factor loadings on different dimensions 
of meaning. Thus, the results of one factor 
analysis indicate that calm-agitated has a 
primary loading on the evaluative dimen- 
sion and a secondary loading on activity 
(Osgood et al., 1957). In another factor 
analysis, when the scale was changed to 
calm-excitable, the loading on evaluation 
disappeared and the scale became more 
clearly aligned with the activity dimension. 

In general, then, it seems likely that the 
bipolarity assumption is particularly un- 
warranted for those scales where one of 
the adjectives implies certain characteristics 
which the other polar adjective does not 
negate. 


TESTS OF THE BIPOLARITY ASSUMPTIONS 


The assumption of bipolarity for semantic 
space, as well as for the individual semantic 
differential scales, has never adequately 
been investigated empirically. Because of 
the forced bipolar structure of the semantic 
differential scales, semantic space is arbi- 
trarily required to be bipolar. By using 
scales which do not force bipolarity, it 
would be possible to determine empirically 
whether or not semantic space actually 
(i.e., functionally) is bipolar. Thus, instead 
of using the standard semantic differential 
scale which is comprised of adjectival op- 
posites (e.g., good-bad), the connotative 
meaning of concepts can be measured by 
means of single-adjective scales (e.g., good 
and bad). This can be done by having in- 
dividuals rate a concept according to 


whether the scale is positively or negatively 
related to it. A negative relationship, ac- 
cording to the bipolarity assumption, should 
imply that the concept had the opposite 
characteristic from the adjective defining 
the scale. 

If the semantic differential scales, and 
therefore also semantic space, are actually 
bipolar, several correlational and factorial 
results should follow. For example, factor 
analyses of sets of single-adjective scales, 
each constructed from the same bipolar 
adjectival scale, should have equal but op- 
posite loadings on the same factor. To 
illustrate, consider the good-bad scale, 
which has been found to have a .88 loading 
on the evaluative factor (Osgood et al., 
1957). If the bipolarity assumption is valid, 
one would expect the good scale to have a 
loading of approximately .88 on the evalua- 
tive dimension and the bad scale to have 
a mirror-image loading, that is, -.88. 
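
The mirror-image expectation can be illustrated with a minimal sketch. It assumes, purely for illustration, that a perfectly bipolar "bad" rating is the exact reflection (8 minus the rating) of the "good" rating on the 7-point coding, and it uses the correlation with a simulated factor score as a stand-in for a factor loading. The simulated data, the function names, and the noise levels are hypothetical and are not taken from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def to_rating(value):
    """Map a continuous score onto the 1-7 coding, with 4 as the midpoint."""
    return max(1, min(7, round(4 + 1.5 * value)))


random.seed(0)
# Hypothetical evaluative factor scores for 500 simulated raters.
factor = [random.gauss(0, 1) for _ in range(500)]

# "good" follows the factor with some noise; under strict bipolarity
# (an assumption of this sketch) "bad" is its exact reflection.
good = [to_rating(f + random.gauss(0, 0.5)) for f in factor]
bad = [8 - g for g in good]

r_good = pearson(good, factor)  # stand-in for the loading of "good"
r_bad = pearson(bad, factor)    # stand-in for the loading of "bad"
r_pair = pearson(good, bad)     # correlation between the two single scales

print(f"good: {r_good:+.2f}  bad: {r_bad:+.2f}  good-bad: {r_pair:+.2f}")
```

Under this idealized model the two stand-in loadings are equal in size and opposite in sign, and the pair correlates -1. Any departure from that pattern in real single-adjective ratings is what the tests described here are meant to detect.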

Semantic space obviously could be 
bipolar even if it should turn out that some 
(or even all?) of the semantic differential 
scales proposed by Osgood et al. are not 
bipolar. The presence of many scales that 
are not bipolar could result in equivocal 
findings. It is possible to classify the scales 
into two categories—those which seem to be 
composed of adjectives which are most 
obvious in their opposing meaning (e.g., 
good-bad), and those in which the polar 
adjectives are least obvious in their op- 
posing meaning (e.g., active-passive). If 
semantic space is bipolar, this bipolarity 
should be obtained more frequently when 
the most obviously opposite adjectives are 
used to define the single-adjective scales 
referred to above than when the least 
obvious are used. 


METHOD 


Single-Adjective Semantic Differential Scales 


In order to test the assumption of bipolarity 
for semantic space in general, as well as for in- 
dividual scales, a semantic differential with single- 
adjective scales was devised. Most of the scales 
for this form of the semantic differential were 
selected from the adjectives defining the 50 bi- 
polar scales used by Osgood et al. (1957, p. 37) 
in their first factor analysis. 

Scale Selection. In selecting the adjectives to 
be used for the modified semantic differential, 
two judges (the authors) independently classified 
the list of bipolar scales into two categories: most 
obviously bipolar scales and least obviously bi- 
polar scales. The method of classification was 
based on a judgment of the ease or difficulty 
with which the opposite pole could be predicted, 
given each one of the polar adjectives. For ex- 
ample, if “good” was easily or unambiguously 
predicted from “bad” and “bad” from "good," 
the good-bad scale was classified as being most 
obvious. If the prediction could not readily be 
made in both directions for a given scale (e.g., 
fragrant-foul), the scale was considered least 
obvious. Only those scales on which both judges 
had complete agreement were retained in the 
two categories. From this list of most and least 
obvious bipolar scales, a second selection was made 
so that the two lists would be approximately equal 
in the number of scales with loading on the evalu- 
ative, potency, and activity dimensions. Conse- 
quently, the final dichotomy between most and 
least obvious opposites included some compro- 
mise between obviousness and factor representa- 
tiveness. The bipolar scales retained for use in 
the construction of the single-adjective semantic 
differential are presented in Table 1. The most 
obvious list has a total of 15 scales, and the least 
obvious has 14 scales. The additional adjectives 
excitable and excited were added to the least 
obvious list as possible alternate poles for agitated 
in the calm-agitated scale. Three possible opposites 
to calm were included in the hope of determining 
empirically which, if any of them, were opposite to 
calm in semantic space. 

Format of the Single-Adjective Semantic Dif- 
ferential. Although the single-adjective scales used 
in this form of the semantic differential require 
that judgments be made on a single adjectival 
dimension, the instructions given to subjects were 
worded so as to make the ratings as comparable 
to the bipolar semantic differential as possible 
(cf. Osgood et al., 1957, p. 82 ff.). The instruc- 
tions used for the modified semantic differential 
were as follows: 


The purpose of this study is to measure the 
meanings that some words have for you. We 
will ask you to judge them against a series of 
descriptive adjectives. In doing this, please make 
your judgments on the basis of what these words 
mean to you. On top of each page of this book- 
let you will find the word to be rated, and be- 
low it adjectives on which to rate it. You are 
to rate the word at the top of each page on each 
of these adjective scales in order. Here is how 
you are to use these adjective scales: If you 
feel that the descriptive adjective (e.g., fair) 
is very positively related to the word at the 
top of the page, you should place your check 
mark as follows: 
fair 


TABLE 1 
Most and Least Obviously Bipolar Scales Used for the Single-Adjective 
Semantic Differential 

Most obvious          Scale nos.   Least obvious        Scale nos. 
good-bad              1-2          tasty-distasteful    31-32 
large-small           3-4          sharp-dull           33-34 
hard-soft             5-6          heavy-light          35-36 
strong-weak           7-8          sacred-profane       37-38 
clean-dirty           9-10         thick-thin           39-40 
red-green             11-12        nice-awful           41-42 
pleasant-unpleasant   13-14        bright-dark          43-44 
happy-sad             15-16        bass-treble          45-46 
wet-dry               17-18        angular-rounded      47-48 
long-short            19-20        fragrant-foul        49-50 
hot-cold              21-22        active-passive       51-52 
honest-dishonest      23-24        rugged-delicate      53-54 
fast-slow             25-26        pungent-bland        55-56 
wide-narrow           27-28        calm-agitated        57-58 
alive-dead*           29-30        -(excitable)         -59 
                                   -(excited)           -60 

* Although the alive-dead scale was not actually included in the original 
factor analysis, it was rated by subjects as being appropriate to the 
Activity dimension (Osgood et al., 1957). 


If the descriptive adjective is quite positively 
related (but not extremely) to the word, you 
should place your check mark as follows: 


If the descriptive adjective seems only slightly 
positively related (but is not really neutral) to 
the word, then you should check as follows: 


If you consider the word to be neutral on the 
adjective scale, or not at all related to the word, 
then you should place your check mark in the 
middle space: 


In general, then, the right side of the scale 
refers to a positive relationship between the 
descriptive adjective and the word, and the 
left side of the scale refers to a negative re- 
lationship between the descriptive adjective and 
the word. The middle is the neutral point. 


IMPORTANT: (1) Place your check marks in 
the middle of spaces, not 
on the boundaries. 


X: : : x 


This Not this 
(2) Be sure you check every 
descriptive adjective for 
every word at the top of 
the page—do not omit any. 
(3) Never put more than one 
check mark on a single ad- 
jective scale. 
Do not try to remember how you checked similar 
descriptive adjective scales earlier. Make each 
item a separate and independent judgment. 
Work at a fairly high speed. Do not worry or 
puzzle over individual items. It is your first 
impressions, the immediate “feelings” about the 
items, that we want. On the other hand, please 
do not be careless, because we want your true 
impressions. 


Method of Administration. A form of the single- 
adjective scale semantic differential was adminis- 
tered to a group of 251 male and female under- 
graduates. They were instructed to rate the 
following concepts: LADY, SIN, SYMPHONY, ME, 
BABY, GOD, TORNADO, MOTHER, STATUE, and COP. These 
10 concepts were randomly selected from the 
list of 20 used by Osgood et al. (1957) in the 
same factor analytic study from which the single- 
adjective scales were constructed. 

To minimize the influence that the rating of 
a concept by an adjectival scale might have on the 
rating of the same concept with the opposite 
adjectival scale, the modified semantic differential 
was divided into two sections: the first section (A) 
contained one of the adjective pairs of both the 
most and least obvious scales, while the second 
section (B) contained the other member of each 
pair. The two sections were constructed so that 
each half contained an equivalent number of 
adjectival scales with loadings on the evaluative, 
potency, and activity dimensions (as determined 
by the factor analysis of these scales in the bi- 
polar format made by Osgood et al.). Further, 
both Sections A and B of the semantic dif- 
ferential booklet were constructed so that each had 
an equivalent number of single-adjective scales 
with positive and negative loadings on each of 
the three factors. Thus, Section A had some scales 
which were positive in evaluation (e.g., happy, 
good, etc.) and others which were negative in 
evaluation (e.g., awful, unpleasant, etc.); Sec- 
tion B similarly had an equivalent number of 
scales with negative (e.g., sad, bad, etc.) and 
positive (e.g., nice, pleasant, etc.) loadings on the 
evaluative dimension. A similar matching pro- 
cedure was carried out for those adjectives with 
loadings on the potency and activity factors. 

Approximately half of the subjects received 
booklets with Section A followed by Section 
B; the remainder of the subjects received booklets 
with Section B followed by Section A. In addition, 
the concepts and scales within each section were 
arranged in several random orders. 

Data Processing. The scale ratings assigned by 
each subject to each concept were coded from 1 
to 7, according to the position of the subject's 
check on the rating sheet; the value of 1 was 
given to the extreme left, and the value of 7 to 
the extreme right of the continuum. The ratings 
were punched on IBM cards twice, and the two 
decks were compared in order to correct any dis- 
crepancies. In a few instances subjects had omitted 
one or more ratings on a scale, or on a concept. 
When this happened on a single scale for a 
subject, we arbitrarily assigned a value of 4 to 
that scale. In general, if a subject omitted more 
than one rating on one scale, he also tended to 
omit a whole page or more. These subjects were 
eliminated from the study. There were 236 sets 
of ratings (i.e., subjects) kept for the study, with 
only 15 being lost because of numerous omissions. 
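
The handling of omissions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the dictionary layout, the function name `clean_subject`, and the exact cutoff of one omitted rating per subject are hypothetical readings of the rule "a single omission is scored 4; subjects with more omissions are dropped."

```python
NEUTRAL = 4        # midpoint of the 1-7 coding, assigned to a single omission
MAX_FILLABLE = 1   # assumed cutoff: more omissions than this drops the subject


def clean_subject(ratings):
    """Return a cleaned copy of one subject's ratings, or None if dropped.

    `ratings` maps (concept, adjective) to an int from 1 to 7, or to None
    when the subject left that scale blank.
    """
    missing = [key for key, value in ratings.items() if value is None]
    if len(missing) > MAX_FILLABLE:
        return None  # too many omissions: subject is eliminated
    cleaned = dict(ratings)
    for key in missing:
        cleaned[key] = NEUTRAL
    return cleaned


# Hypothetical example subjects.
one_gap = {("BABY", "good"): 7, ("BABY", "bad"): None, ("SIN", "good"): 1}
many_gaps = {("BABY", "good"): None, ("BABY", "bad"): None, ("SIN", "good"): 2}

print(clean_subject(one_gap))    # the blank is scored as 4
print(clean_subject(many_gaps))  # None: subject eliminated
```

Scoring an omission as 4 treats it as a neutral or irrelevant judgment, which is the same ambiguity the text notes for the middle category of the bipolar scales.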

For each of the 10 concepts, two separate cor- 
relation matrices were computed—one for the 
set of scales whose adjectives were classified as 
being most obviously opposite in meaning, and 
the other for the adjectives categorized as least 
obviously opposite (see Table 1). In addition to 
these 20 correlation matrices, two more matrices 
were computed on the scores obtained by correlat- 
ing across concepts and subjects within each of 
the two classifications of scales. Pearson product- 
moment correlations were used in all instances. 

Each of the 22 correlation matrices was in 
turn subjected to Thurstone’s complete centroid 
method of factor extraction. The highest cor- 
relation in each column was used as a commu- 
nality estimate for that variable; the commu- 
nality estimates were not reiterated. Factoring 
was stopped when the product of the two largest 
loadings on the last factor extracted failed to 
equal or exceed .04. This criterion for stopping 
apparently will normally result in too many, 
rather than too few, factors being extracted. The 
factors were rotated to the simple structure 
criterion as represented by the Varimax criterion 
of Kaiser (1958). All calculations were carried out 
on an IBM 7040. 
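
The stopping rule for factor extraction can be written as a short check. The sketch below covers only the rule itself, not the centroid extraction or the Varimax rotation; the function name is hypothetical, and taking absolute values of the loadings before picking the two largest is an assumption, since the text does not say how signs were treated.

```python
def factoring_should_stop(last_factor_loadings, criterion=0.04):
    """Return True when extraction should stop after the factor just extracted.

    Stopping occurs when the product of the two largest loadings on that
    factor fails to equal or exceed the criterion. Absolute values are used
    here, which is an assumption of this sketch.
    """
    largest_two = sorted((abs(v) for v in last_factor_loadings), reverse=True)[:2]
    return largest_two[0] * largest_two[1] < criterion


# Hypothetical loadings on a newly extracted factor.
print(factoring_should_stop([0.25, 0.15, 0.05, -0.02]))  # 0.25 * 0.15 = .0375
print(factoring_should_stop([0.30, -0.20, 0.10]))        # 0.30 * 0.20 = .06
```

Because the rule looks only at two loadings, a factor defined by a single adjective pair can survive it, which is consistent with the remark that the criterion tends to yield too many factors rather than too few.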

An indication of the kind of results we would 
get was evident by the number of factors ex- 
tracted from the various correlation matrices. 
This information is summarized in Table 2. 


RESULTS 


The results are too massive for us to 
present all the data we have. We are there- 
fore extracting illustrative and, we hope, 
representative data from among those which 
are most pertinent to answering the ques- 
tions under investigation here. The results 
of a correlational analysis of the data will 
be presented and discussed first, after which 
we will deal with the factor analytic tests of 
the bipolarity hypothesis. 


TABLE 2 
Number of Factors for Each Concept 

                  Number of factors 
Concept       Most obvious   Least obvious 
BABY               12              12 
ME                 10              12 
SYMPHONY            9               8 
LADY                9               9 
SIN                 9               7 
GOD                11              12 
STATUE             11              12 
COP                 8               8 
TORNADO            10               9 
MOTHER              9               8 


Correlation 


There were a total of 9,750 correlation 
coefficients computed. It seems imperative 
to explore this tremendous mass of data to 
see what light it throws on the problems 
being investigated. 

For both correlation coefficients and fac- 
tor loadings, one has the problem of deciding 
how high the statistic should be before 
one is warranted in basing conclusions con- 
cerning theory on them. The problem is 
decided in part on the question of whether 
the value is significantly different from 
zero; it also has to be decided in part on 
the basis of there being sufficient covariance 
involved to admit at least the possibility 
that inferences concerning it will be valid. 
In general, our interpretations of correla- 
tion coefficients will emphasize those that 
are .225 or above (i.e., beyond the .01 
level of significance). The usual r² inter- 
pretation of correlation implies that this 
cutoff accounts for as little as 5% of 
common variance. This actually comes close 
to independence, and therefore any such 
interpretations must be made with caution. 
Any inferences that might be based on non- 
significant or “presumably” zero correla- 
tions also must be made with caution. Zero 
correlations might occur for either of two 
reasons. They might occur either because 
the adjectives in the pair are intrinsically 
unrelated to one another, or because the 
pair, while related, is not relevant to the 
concept being rated. In this latter case, the 
subjects’ responses would randomly fluctu- 
ate around a rating of 4, the range of such 


responses being a function of the stability or 
reliability of ratings of this kind. 
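
The arithmetic behind the .225 cutoff can be checked with a short sketch. It assumes a sample of 236 subjects for a single-concept matrix (the number retained after omissions) and uses the standard t transformation of a Pearson r; the function names are hypothetical.

```python
import math


def shared_variance(r):
    """Proportion of common variance implied by a correlation (r squared)."""
    return r * r


def t_for_correlation(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


CUTOFF = 0.225
N_SUBJECTS = 236  # assumed per-concept sample size

print(f"common variance at cutoff: {shared_variance(CUTOFF):.4f}")
print(f"t at cutoff (df = {N_SUBJECTS - 2}): {t_for_correlation(CUTOFF, N_SUBJECTS):.2f}")
```

The cutoff corresponds to about 5% common variance, and the resulting t lies above the two-tailed .01 critical value for this many degrees of freedom (roughly 2.6), which matches the caution in the text that such correlations are reliable but small.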

If the adjective pairs used to define the 
scales in Osgood's semantic differential 
(hereafter referred to as OSD) are in fact 
reciprocally antagonistic, then certain con- 
sequences of this fact should be apparent 
in the correlations. From this bipolarity as- 
sumption, three hypotheses are stated, and 
data which bear on each of them are pre- 
sented. 

Hypothesis I. The correlations between 
the OSD paired opposites should be nega- 
tive on any concept for which they are 
relevant. We present in Table 3 the inter- 
correlations between each of the OSD pairs 
used in this study. The adjectives are sepa- 
rated into two groups according to whether 
they had been judged to be most or least 
obviously bipolar. 

There are several results in Table 3 
which are particularly important. First, 
note that the upper half of the table (i.e., 
most obvious opposites) contains over twice 
as many correlations over .225 as the lower 
half (i.e., least obvious opposites). Prob- 
ably as a result of this, the results in the 
upper half are usually more definitive than 
are those in the lower. Among the most 
obvious adjective pairs one sees that 10 
pairs do in fact show definite negative 
relationships. The clearest of these are the 
pairs: good-bad, clean-dirty, and honest- 
dishonest. Others showing the same nega- 
tive relationship, but not as consistently 
are: large-small, hard-soft, strong-weak, 
happy-sad, long-short, fast-slow, and alive- 
dead. Two pairs show both positive and 
negative correlations: wet-dry, which 
ranges from .25 on SYMPHONY to -.44 on 
TORNADO, and wide-narrow, which varies 
from .21 on BABY to -.29 on TORNADO. 
Finally, two pairs apparently are not polar 
opposites in any sense of the word, in that 
the sign of every correlation obtained was 
positive. These are: red-green, which was 
significantly positive in 8 of 10 instances, 
and pleasant-unpleasant, which was signifi- 
cantly positive in two cases. Only one pair 
produced no correlations as high as .225: 
hot-cold, which ranged from -.15 on LADY 
to .11 on MOTHER. 
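
The way each adjective pair is summarized for Hypothesis I (counts of positive and negative correlations, and how many reach the .225 cutoff) can be sketched as below. The two rows of values are transcribed from Table 3 with decimal points restored; the function name and the returned field names are hypothetical.

```python
CUTOFF = 0.225


def summarize_pair(correlations):
    """Count signs and cutoff-exceeding values across the ten concepts."""
    return {
        "positive": sum(r > 0 for r in correlations),
        "negative": sum(r < 0 for r in correlations),
        "at_or_above_cutoff": sum(r >= CUTOFF for r in correlations),
        "at_or_below_minus_cutoff": sum(r <= -CUTOFF for r in correlations),
    }


# Order of concepts: BABY, ME, SYMPHONY, LADY, SIN, GOD, STATUE, COP,
# TORNADO, MOTHER (values as read from Table 3).
wet_dry = [-0.12, 0.06, 0.25, 0.04, 0.20, 0.22, -0.04, 0.10, -0.44, 0.09]
red_green = [0.08, 0.27, 0.46, 0.23, 0.12, 0.32, 0.32, 0.39, 0.24, 0.44]

print("wet-dry:  ", summarize_pair(wet_dry))
print("red-green:", summarize_pair(red_green))
```

For wet-dry the summary shows one correlation beyond the cutoff in each direction, the mixed pattern described in the text; for red-green all ten signs are positive and eight values reach the cutoff.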

Among the least obvious OSD adjective 

TABLE 3 
Correlations Between Semantic Differential Paired Opposites 
(decimal points omitted) 

Adjective pair               BABY       ME SYMPHONY     LADY      SIN      GOD   STATUE      COP  TORNADO   MOTHER    No. +    No. -  Across concepts 
Most obvious 
good-bad                      -33      -30      -09      -34      -39      -46      -16      -45      -27      -34        0       10              -30 
large-small                   -17      -37      -21       03      -09      -54      -09      -14      -38      -21        1        9               06 
hard-soft                     -15      -32       06      -33      -03       04      -28      -17      -32      -20        2        8               08 
strong-weak                   -17      -21      -19      -26      -15      -51      -15      -20      -49      -47        0       10              -21 
clean-dirty                   -25      -42      -17      -21      -33      -47      -46      -38      -34      -32        0       10              -37 
red-green                      08       27       46       23       12       32       32       39       24       44       10        0               52 
pleasant-unpleasant            07       21       02       10       39       10       11       14       24       13       10        0               14 
happy-sad                     -80      -30       02      -13      -25      -22       16      -20      -16      -30        2        8              -18 
wet-dry                       -12       06       25       04       20       22      -04       10      -44       09        7        3               19 
long-short                    -05      -50      -11      -10       03      -03       09      -12      -26      -32        2        8               01 
hot-cold                       06       00       01      -15       08      -02      -10      -10      -07       11        4        5               10 
honest-dishonest              -22      -48      -11      -35      -16      -56       03      -60       09      -34        2        8              -32 
fast-slow                     -29      -32       09      -11      -24      -23      -01      -34      -43      -26        1        9              -12 
wide-narrow                    21      -28       03       06       05      -22      -16       04      -29       01        6        4               20 
alive-dead                    -14      -36      -23      -23      -18      -48      -31      -09      -28      -17        0       10              -31 
Number of r above .225          4       11        3        6        5        8        4        5       12        8                                  5 
Least obvious 
tasty-distasteful             -02      -06      -13      -12      -23      -06      -11      -12      -05       07        1        9              -08 
sharp-dull                     22      -05       11       25      -01       14       17       03      -19       08        7        3               19 
heavy-light                   -14      -16      -10      -18      -29       01      -02       02      -15      -08        2        8               14 
sacred-profane                -13      -10       03      -14      -28      -26       01      -03      -01      -07        2        8               01 
thick-thin                    -10      -32       15      -09       07      -03      -03      -20      -10      -19        2        8               10 
nice-awful                    -18      -27      -13      -22      -42      -28      -13      -29      -38      -37        0       10              -13 
bright-dark                   -06       02       02       01      -34      -18      -13      -14      -13      -12        3        7               04 
bass-treble                    05      -04       18       02       06       05       24      -10       05       03        8        2               32 
angular-rounded               -17      -08       09       03       05      -01      -04       00       04      -1?        4        5               23 
fragrant-foul                 -18      -08       07      -11      -20       00       08      -07      -15      -13        2        7              -03 
active-passive DLE ДЗО 291: |:-16 [И®-48, 1-99 #—880| —42:| —20.| 0:| 10—38 
rugged-delicate |—07 | —35 | 11 | —03 | —18 | —18 | —12| —11 | -37 | —13 | 1 9| 02 
pungent-bland 3212: ово 10102 | ado elu 10") 121208. 7 | Shenae 
calm-agitated —24 | —38 (ps bes ло ceat eor] 0 1] 0 en 
Number of r's 
above .225 1 4 1 2 5 3 a8 Ан „2 5 
calm-excitable —25 | —28 | —16 | —36 | —12 | —31 | —14 | —40 | —25 | —40 0 | 10 | —26 
calm-excited —23 | —18 | —01 | —24 | —16 | —20 00 | —29 | —36 | —29 | 0| 9 19 


Note.—Decimals have been omitted from the correlations. 


pairs in the lower half of Table 3, only 
three appear to be opposed to each other: 
nice-awful, active-passive, and calm-agitated. 
Six pairs showed almost no relationship 
of consequence on any of the concepts: 
tasty-distasteful, heavy-light, bass-treble, 
angular-rounded, fragrant-foul, and 
pungent-bland. When one looks at the possible 
relevance of these adjectives to the concepts 
being rated, it becomes immediately 
obvious why little or no relationship was 
found. These concepts are not normally defined 
in these terms. It may well be, then, 
that the generally lower levels of correlation 
among the least obvious pairs are due 
as much to irrelevant pairings of adjectives 
with concepts as to the fact (if it is a 
fact) that these word pairs are in general 
less clearly opposed in meaning than the 
most obvious pairs. 

When one looks at the correlations of 
the various adjective pairs on the various 
concepts in the two halves of Table 3, a 
few surprises are evident. The concepts 
giving rise to the greatest number of high 
correlations are: ME, GOD, TORNADO, and 
MOTHER, although the trend is not so clear 
in the lower half of the table as it is in the 
upper. BABY, SYMPHONY, and STATUE pro- 
duced the least number of correlations. One 
of the highest correlations on SIN was .39 
between pleasant and unpleasant. BABY has 
only one correlation as high as .30. This 
is also true of SYMPHONY, but in this latter 
case the correlation is positive: .46 be- 
tween red and green. The concepts under 
which the highest set of significant correla- 
tions occurred were TORNADO and ME, but the 
highest single correlation was -.60 between 
honest-dishonest under COP. The concept 
GOD had the greatest number of correlations 
above .40. 

From these findings, it would appear that 
certain of the concepts used by Osgood 
et al. are probably more generally agreed 
upon in definition than are others. We do 
not normally think of ME, GOD, and MOTHER 
as simple concepts. There appears to be 
definitely more agreement as to certain of 
the essential characteristics of these concepts, 
however, than there is on many of 
the others. TORNADO shows similar agreement, 
but is in a somewhat different class 
of concepts. COP also seems fairly well 
defined. By way of contrast, BABY, LADY, and 
SIN were much more poorly defined by these 
adjective pairs than any other concepts, ex- 
cept possibly for SYMPHONY and STATUE. 

Hypothesis II. If the OSD paired adjec- 
tives are in fact polar opposites, then the 
correlations between any two members of 
a pair should typically result in higher 
negative values than the correlations be- 
tween nonpaired adjectives. That is, if two 
adjectives are in fact polar opposites, then 
using each of them separately would be 
equivalent to setting up two scales to 
measure the same dimension. This would 
be somewhat the same as setting up a 
method for estimating the reliability with 
which a dimension may be used. For practical 
purposes, the reliability of a scale 
represents its expected upper limit of correlation. 
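
The upper-limit remark can be made explicit with the standard attenuation relation of classical test theory. This is offered only as a sketch: the symbols below are introduced here for illustration and are not the authors' notation, and the relation assumes uncorrelated measurement errors. Let $r_{xx'}$ and $r_{yy'}$ be the reliabilities of the two single-adjective scales and $\rho_{T_x T_y}$ the correlation between their true scores.

```latex
\[
  r_{xy} \;=\; \rho_{T_x T_y}\,\sqrt{r_{xx'}\,r_{yy'}}
  \qquad\Longrightarrow\qquad
  \lvert r_{xy}\rvert \;\le\; \sqrt{r_{xx'}\,r_{yy'}} .
\]
```

For perfect polar opposites $\rho_{T_x T_y} = -1$, so the observed coefficient would be $-\sqrt{r_{xx'}\,r_{yy'}}$; when the two adjectives are equally reliable this bound reduces to the common reliability itself.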

Of course, there may be concepts for 
which two or more OSD scales themselves 
are highly correlated. Due to differences 
in scale reliability, it is possible that the 
two adjectives of the more reliable scale 
might correlate higher with the adjectives 
defining the less reliable scale than these 
latter adjectives would with each other. 
Some allowance needs to be made for this 
possibility as this hypothesis is considered. 
It is also possible that in some instances 
the same adjective really might be appropriate 
to more than one OSD type of 
scale. Bright, for instance, reasonably 
might be opposed to either dark or dull 
and would have very different meanings 
in the two scales. It was precisely this 
possibility of ambiguity, however, which 
was used as one of the criteria for dividing 
the adjectives into the most and least ob- 
viously opposite groupings. 

There are, then, two possible bases on 
which any effort to evaluate this hypothesis 
might be challenged. First is the possibility 
that the same adjective might legitimately 
be involved in more than one OSD scale. 
When this is the case, the adjective should 
be low in reliability, and, unless both are 
equally complex, only one member of the 
pair as used by us should meet the criterion: 
the one which is intrinsically more 
reliable. Second is the possibility that the 
pairs included in this study might not re- 
present the best possible selection to give 
the hypothesis a fair test. Since no effort 
was made to be sure that adjective pairs 
having unbalanced complexity were in- 
cluded in our list, it may be that we have 
biased the results in the direction of con- 
firmation. 

Recall now that 13 of the adjective pairs 
appearing in Table 3 were potentially defen- 
sible as being polar opposites. The highest 
negative correlation of each of the 26 ad- 
jectives among these 13 pairs with any 
other adjective was found, and the 9 of the 
13 pairs which yielded the most favorable 
evidence for bipolarity are presented in 
Table 4. Similar correlations for 4 other 
adjectives (pleasant, unpleasant, excitable, 
and excited) are also shown in Table 4. 

Do the correlation coefficients support or 
negate the hypothesis? No yes-or-no answer 
to this question seems possible. Bipolarity 
does not appear to exist as 

TABLE 4 
ILLUSTRATIONS OF HIGHEST NEGATIVE CORRELATIONS AMONG SELECTED ADJECTIVES 

Note.—See Table 1 for adjective code numbers. When no adjective number appears, the correlation 
was with the OSD pair. Decimals have been omitted from the correlations. 

dishonest; the bad COP is not honest; the 
good SIN is not dirty; and, interestingly, the 
good GOD is not dishonest. 

There are two sets of scales shown in 
Table 4 which were included because they 
have some additional bearing on the prob- 
lem of empirically determining the way in 
which adjectives should be paired if one 
is going to construct bipolar scales. At the 
bottom of Table 4, one finds calm followed 
in turn by agitated, excitable, and excited. 
The OSD scale originally was calm-agi- 
tated, but later was changed to calm-excit- 
able (Osgood et al., 1957). According to 
our findings, the adjective which generally 
seems to be used in a manner that is op- 
posite to calm is excitable. It is interesting 
to note that when Osgood et al. made the 
change to calm-excitable, the scale ob- 
tained a “purer” loading on the activity 
dimension. 

The other findings relevant to 
the above point are those for 
pleasant and unpleasant. Interestingly 
enough, pleasant and unpleasant were never 
significantly negatively correlated with each 
other. One can wonder whether either of 
these adjectives has any reasonably con- 
sistent functional opposite. It turns out that 
unpleasant has a significant negative coefficient 
with some adjective on every concept 
except STATUE and that its highest 
negative coefficient is with good for all 
concepts except ME and SIN. On ME it was 
opposed to happy (-.39); on SIN it was a 
tie between happy and good at -.47 for 
each. Even for ME, the correlation with good 
was close to that with happy, so it would 
appear that we could reasonably nominate 
good as a functional opposite to unpleasant. 
To do so, however, seems to create more 
problems than it solves. For instance, the 
functional opposite of bad is good for seven 
of the concepts. Good is opposed to unpleasant 
on six concepts and has its highest 
opposition with bad only once (BABY), where 
there is a tie with unpleasant. Happy has 
more significant correlations opposed to un- 
pleasant (four times) than any other ad- 
jective, but sad is opposed to happy more 
than to any other. It would appear that 
there may be no real clear-cut pairings of 
these adjectives into functionally opposite 
and mutually exclusive bipolar scales, 
insofar as this approach to pairing is concerned. 
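
The pairing rule used in this discussion (take as an adjective's "functional opposite" whichever other adjective it correlates with most negatively) can be sketched as follows. The coefficients are made up for illustration and are not taken from Table 4, and the helper name `functional_opposites` is hypothetical. The point of the sketch is that the rule need not produce mutual pairs.

```python
def functional_opposites(adjectives, corr):
    """Map each adjective to (other adjective, r) with the most negative r."""
    result = {}
    for i, name in enumerate(adjectives):
        # Consider every other adjective; skip the diagonal entry.
        candidates = [
            (corr[i][j], adjectives[j])
            for j in range(len(adjectives))
            if j != i
        ]
        r, other = min(candidates)  # smallest (most negative) coefficient
        result[name] = (other, r)
    return result


# Illustrative, made-up correlation matrix (NOT the study's data).
adjectives = ["good", "bad", "unpleasant", "happy"]
corr = [
    [1.00, -0.30, -0.45, 0.55],
    [-0.30, 1.00, 0.40, -0.18],
    [-0.45, 0.40, 1.00, -0.39],
    [0.55, -0.18, -0.39, 1.00],
]

opposites = functional_opposites(adjectives, corr)
for name, (other, r) in opposites.items():
    print(f"{name}: most opposed to {other} (r = {r:+.2f})")
```

With these invented values, bad is most opposed to good, yet good is most opposed to unpleasant, which is the kind of non-reciprocal pairing described in the text.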

Pleasant showed only two coefficients of 
much size: -.42 with clean on SIN and 
-.35 with soft on TORNADO. It would seem 
that, except for the possibility that we 
propagate the expectation that the dirtier 
a SIN is the more pleasant it will be, pleas- 
ant has no meaningful functional opposite 
among this set of adjectives! 

There are some additional interrelation- 
ships among adjectives in the data that 
might be interesting to point out. For instance, 
as defined by Osgood et al., the 
opposite of nice is awful. According to our 
data, the functional opposite of nice is not 
awful, but distasteful. This finding holds 
for 8 of the 10 concepts. Clearly, if scales 
are constructed according to highest negative 
correlations, nice could be used in opposition 
to either of two adjectives, but 
perhaps with the greater consistency of 
meaningfulness in opposition to distasteful 
than to awful. Many specific instances of 
this phenomenon could be pointed out. In 
six instances good has its highest negative 
correlation with unpleasant, even though 
in two to four of these the difference in 
favor of good-unpleasant over good-bad is 
either tied or small (depending on one’s 
definition of small). But they are sensible. 
The good BABY is not unpleasant. If I am 
good, I am not unpleasant, etc. In all these 
cases involving unpleasant except one (a 
bad MOTHER being not clean), if the concept 
is bad it is not good. One could continue 
with this rather fascinating play on adjectives 
as they are used in relation to concepts, 
but time and space will not permit. 

Finally, a somewhat surprising result is 
also illustrated in the final column of Table 
4. The results obtained among adjectives 
after correlating across concepts (as well 
as subjects) are amazingly unpredictable 
from the separate concept correlations. 
They seem almost capricious in some instances. 
For example, slow has its highest 
negative correlation with fast eight times 
on individual concepts, and with alive only 
once. Despite this pairing of slow with fast 
on an individual concept basis, slow was 

TABLE 5 
CORRELATIONS BETWEEN VARIOUS OSD PAIRS AND SELECTED THIRD ADJECTIVES 

Adjective pair | BABY | ME | SYMPHONY | LADY | SIN | GOD | STATUE | COP | TORNADO | MOTHER | Correlated across concepts 
happy-unpleasant —26 | —39 | —08 | —26 | -47 | -26 | —01 | -38 | —% | -4 —20 
sad-unpleasant | 19| 27| 08) 47) 3 15| 12 | | 6| 24 БШ 
cold-dead 36| 32] 38| 35| 05| 32| 32| 28|—04| 35 37 
hot-dead 19 | 02|-089| -00| 02) 07|-02|-02| of 05 12 
good-happy 42| 30| 2| 5| 46) 46| 36| 41| 39| 34] 55 
bad-happy -09|-1| 01|—23| —28 | -28| -06 | -19| —13| -14| —18 
honest-bad -27 | —26| —34 | —31 | —18 | —40 | -12 | - —22 | —32 —39 
dishonest-bad 2| | 21| 57] 55| ss] 33| |-и 70| ôl 
dirty-unpleasant | 45 47 16 48 67 45| 46 59 60| 4 | 69 
clean-unpleasant -16| 98 1—16 | -14 | -2 | -21| و‎ | -34 | -21 | -M| —29 
wide-short 04|-02| —16 | 09| 08|—-09| 26|—09| —M —08 | 18 
narrow-short —08 | —06 13 09 16 21 09 | * 12 | 07 04 22 
dead-cold 30| 32| ss| 35|] 05| | 32| 2| 0| 35 37 
alive-cold от | -17| —28 | -23 | -o | Lar | —33 | —06'] 05|-M| 25 


Note.—Decimals have been omitted from the correlations. 


paired with alive when we correlated across 
concepts. Unpleasant never had its highest 
negative coefficient with honest on the 
separate concepts, but did upon correlating 
across concepts. Correlating across concepts 
apparently is not yielding simple averages 
of the correlation coefficients. Exactly what 
it is yielding is not very clear from these 
data. 
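
One way a coefficient computed across concepts can depart from the average of the per-concept coefficients is that pooling mixes within-concept covariation with differences between concept means. The sketch below uses made-up ratings, not the study's data, and assumes that "correlating across concepts" means pooling the raw ratings from all concepts before computing one coefficient; `pearson` is a plain product-moment helper written for this illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up ratings on two adjectives for two hypothetical concepts.
# Within each concept the adjectives are perfectly opposed, but the second
# concept is rated higher on both adjectives.
ratings = {
    "CONCEPT_A": ([1, 2, 3, 4], [4, 3, 2, 1]),
    "CONCEPT_B": ([5, 6, 7, 8], [8, 7, 6, 5]),
}

within = {name: pearson(x, y) for name, (x, y) in ratings.items()}
mean_within = sum(within.values()) / len(within)

pooled_x = [v for x, _ in ratings.values() for v in x]
pooled_y = [v for _, y in ratings.values() for v in y]
pooled = pearson(pooled_x, pooled_y)

print(f"mean of within-concept r: {mean_within:+.2f}")
print(f"pooled (across-concept) r: {pooled:+.2f}")
```

Here each within-concept coefficient is -1, yet the pooled coefficient is positive, because the two concepts differ in mean level on both adjectives. Whether this mechanism accounts for the discrepancies in Table 4 cannot be determined from the published coefficients alone.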

Hypothesis III. If two adjectives are 
indeed semantic opposites, and one finds a 
correlation between one member of the pair 
and a third adjective, one should expect to 
find an essentially equal but opposite cor- 
relation between this third adjective and 
the other member of the semantic pair. 
Seven such pairings of most obviously 
opposite adjectives are presented in Table 
5. A rather different kind of result appeared 
in this table as compared to Tables 3 and 4. 
In the earlier tables we saw that several 
adjective pairs were in some measure 
treated by raters as if they were semantic 
opposites, although which pairs did so 
depended in part on the concept being rated. 
In Table 5 we see that it usually occurs 
that one adjective paired in turn against 
each of the two of an OSD pair does not 
correlate with them in a manner that is 
symmetrical around zero. The two adjectives 
that are on the “same” end of a 
scale (i.e., both on the desirable end, or 
both on the undesirable end) correlate much 
higher with each other than they do with 
adjectives that are on the opposite end of 
the continuum.² For instance, happy correlates 
substantially with good but virtually 
not at all with bad. Dishonest correlates 
well with bad and not so highly with good. 
Unpleasant correlates well with dirty but 
not nearly so well with clean. Dead corre- 
lates with cold but not with hot. These 
results offer preliminary evidence against 
the likelihood that the adjective pairs in 
our most obviously opposite category are 
really opposites in any absolute sense. 
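
The symmetry expectation of Hypothesis III can be checked numerically: if two adjectives were perfect opposites, their correlations with any third adjective would sum to zero. The sketch below uses the across-concepts coefficients for three of the Table 5 pairings as they can be read from the final column (decimals restored); the function name `asymmetry` is introduced here for illustration only.

```python
# (third adjective, same-pole member, r_same, opposite-pole member, r_opposite)
# Coefficients are across-concepts values read from Table 5, decimals restored.
TRIPLES = [
    ("happy", "good", 0.55, "bad", -0.18),
    ("unpleasant", "dirty", 0.69, "clean", -0.29),
    ("dead", "cold", 0.37, "hot", 0.12),
]


def asymmetry(r_same, r_opposite):
    """Sum of the two coefficients; 0 means perfect symmetry around zero."""
    return r_same + r_opposite


results = {
    third: round(asymmetry(r_same, r_opp), 2)
    for third, _, r_same, _, r_opp in TRIPLES
}
for third, value in results.items():
    print(f"{third}: departure from symmetry = {value:+.2f}")
```

All three sums are well above zero (roughly .37 to .49), which restates in one number per pairing the pattern described above: the same-pole correlation is much larger in magnitude than the opposite-pole one.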

Summary of Correlational Results. While 
the correlational results are certainly suggestive, 
we feel that they point more to 
directions for further research than 
to firm conclusions. Some of our correla- 
tional results appear to justify thinking of 
certain adjective pairs as being semantic 
opposites; others do not. Even where adjec- 
tives are related in expected ways, how- 
ever, inferences that should follow from 
such relationships fail to be supported (i.e., 
functionally, they represent the most ap- 
propriate opposites). In general, the cor- 
relational evidence seems to justify the 


² One of the few exceptions to this trend found 
in our survey (which was not exhaustive) was 
that for some concepts, unpleasant correlated 
more highly with happy than it did with sad. 

conclusion that adjectives are used more 
independently than might have been usually 
suspected. We further feel that to the extent 
that bipolarity is a reality, it tends to be 
specific to the concepts being defined rather 
than being generalized to all concepts. If 
there are completely general semantic op- 
posites, they probably are very few in 
number. 

The size of the correlations themselves 
contributes to the above conclusions. Their 
magnitudes are such that we cannot say 
that any word pairs are so closely inter- 
dependent that they must be considered 
semantic opposites. We have spoken pre- 
viously of the need for caution in interpret- 
ing low coefficients. One might wonder to 
what extent the coefficients we obtained are 
so often relatively low because of intrinsic 
unreliability of the subjects’ rating be- 
havior. Among all the correlations reviewed 
the highest positive coefficients found were 
.75 between honest and good on the concept 
GOD, .73 between honest and good on the 
concept LADY, .70 between bad and dishonest 
on the concept MOTHER, and .67 between 
dirty and unpleasant on SIN. These 
correlations suggest that the intrinsic reli- 
ability of ratings of this kind may well be 
in the range of .70 or better. The difference 
in magnitude between positive correlations 
resulting from same-pole adjectives versus 
the negative correlations resulting from the 
opposite-pole adjectives probably is a real 
and repeatable phenomenon. If so, then 
polar opposites must each have unique 
variance and fail to define completely a 
truly bipolar scale. 


Factor Analyses 


From the correlational results it appeared 
impossible to rule out clearly and unequivo- 
cally the possibility that there are such 
things as attribute scales which are bipolar 
and which have adjectives that properly 
define the opposing poles of such scales. This 
being so, it remains an open question as 
to whether or not factorial analyses of the 
correlation matrices will result in the description 
of bipolar factors. We turn now, 
therefore, to the results of 22 factor analyses 
which were carried out in order to 
answer this question. We shall use parts of 
these analyses to explore certain hypotheses 
similar to those explored in the section on 
correlational results. Many of the findings, 
however, will be presented without the 
statement of formal hypotheses. 

Hypothesis I. If there is a generalized 
semantic space which is made up of a 
series of bipolar factors, this should appear 
when one analyzes scores obtained by cor- 
relating scales across a set of concepts. If, 
on the other hand, there is no generally 
applicable semantic space, then this analysis 
across concepts will simply give us a sort 
of average semantic space which would be 
specific to the set of concepts being used. 
In either case, the nonapplicability of a 
class of adjectives to the concepts being 
rated could result in analyses of single con- 
cepts failing to produce some part of the 
across-concepts space. 

Hypothesis II. If the OSD adjective pairs 
are truly semantic opposites, we should 
find them loaded approximately equally, 
but with opposite signs, on the same factor. 

Hypothesis III. To the extent that bipolar 
factors are found, they should occur more 
frequently for the scales we have called 
most obvious polar opposites than for the 
scales we have labeled least obvious. 
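
The criterion in Hypothesis II (both members of a pair loaded on the same factor with opposite signs) can be sketched as a simple screen. The cutoff of .30 and the three loadings are those reported in this section for Factor VI of the least obvious analysis; the function name `bipolar_pairs` is hypothetical, and adjectives with no reported loading are treated as falling below the cutoff.

```python
CUTOFF = 0.30  # loadings below this value are not reported in the tables


def bipolar_pairs(loadings, pairs, cutoff=CUTOFF):
    """Return the pairs whose two members both load at or above the cutoff
    on the factor, with opposite signs."""
    found = []
    for a, b in pairs:
        la, lb = loadings.get(a), loadings.get(b)
        if la is None or lb is None:
            continue  # no reported loading: treated as below the cutoff
        if abs(la) >= cutoff and abs(lb) >= cutoff and la * lb < 0:
            found.append((a, b))
    return found


# Loadings reported in the text for Factor VI (least obvious adjectives).
factor_vi = {"active": 0.63, "passive": -0.49, "dull": -0.40}
osd_pairs = [("active", "passive"), ("sharp", "dull")]

result = bipolar_pairs(factor_vi, osd_pairs)
print(result)
```

Only active-passive passes the screen on this factor. Note that the two loadings, .63 and -.49, are opposite in sign but not equal in size, so the screen tests opposition rather than the stricter "approximately equally loaded" requirement.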

To investigate the above hypotheses, the 
factor results are presented in two sections. 
The results obtained by correlating scales 
across concepts in the manner used by 
Osgood et al. are presented first. The load- 
ings obtained for the most and least obvious 
bipolar adjective scales are presented sepa- 
rately in Tables 6 and 7, respectively. For 
convenience, only loadings of .30 or above 
are shown, although lower loadings are in 
general quite similar to the results to be 
reported here. From Table 6, one sees that 
the most obvious scales resulted in one 
clearly bipolar factor (II); this corresponds 
to the Evaluative factor obtained by Osgood 
et al. (1957). The factor has five positive 
and five negative loadings, three of them 
being paired opposites in the OSD scales. 
Among the other eight factors which had 
loadings above .30, all but one were completely 
devoid of any suggestion of bipolarity. 
The one, Factor VII, had four loadings 
over .30, and one of these was opposed 

TABLE 6 
COMPARISON OF OSGOOD FACTORS WITH RESULTS OBTAINED BY CORRELATING “MOST OBVIOUS” 
ADJECTIVES ACROSS CONCEPTS 


Most obvious factor мы = et al. 
ictor 
Adjective pair B EG | ` Adjective pair 
Stream- E P+ P=? [Nebbi Maso- n 
lined | лү oma dj uve E Nur e Pd | it 
| | | 
| | - | Ё 
good 75 | | good- 88 | 
bad —54 31 33 | bad 
I 
large 67 large- | 62 | 34 
small 81 small | 
hard 48 41 hard- —48 55 
soft 50 soft | 
strong 70 A strong- 62 
weak 66 weak 
clean 65 | clean- 82 
dirty —49 43 39 dirty 
red 78 red- —33 35 
green 69 31 green 
pleasant 48 43 | pleasant- 82 
unpleasant —58 35 35 unpleasant 
happy 52 41 happy- 76 
sad 34 53 sad 
wet 68 wet- 
dry —62 dry 
long 61 34 long- 34 
short 76 short 
hot 70 hot- 46 
cold —41 48 cold 
honest 72 honest- 85 
dishonest —41 55 34 dishonest 
fast 38 fast- 70 
slow —38 | 44 slow 
wide 65 36 wide- 41 
narrow 35 49 | —38 narrow 
alive 35 | 38 40 alive- a 
dead 32 33 39 dead 


Note.—Only factor loadings of .30 or higher are indicated in this table; decimals have been omitted. 

* Although no actual factor loading is available, Osgood et al. (1957) report that subjects classified 
this scale together with other scales having their loadings on the Activity dimension. 


to the rest in sign, but was not an OSD pair 
of any of the others. 

From Table 7, one sees that the analysis 
of the least obvious set of scales resulted 
in five clearly unipolar factors, one probably 
bipolar factor, and one doubtful factor. 
The bipolar factor (VI) had three loadings 
which were on active (.63), passive (-.49), 

TABLE 7 
COMPARISON OF OSGOOD FACTORS WITH RESULTS OBTAINED BY CORRELATING “LEAST OBVIOUS” 
ADJECTIVES ACROSS CONCEPTS 


Least obvious factor Osgood et al. factor 
Adjective Adjective pairs E P = 
кетн Ч CR БА E vi I I. 2m 
tasty 47 51 tasty- 77 
distasteful —75 distasteful 
sha: H —36 | sharp- 52 
dull —65 —40 dull 
heavy 31 —45 | heavy- —36 62 
light —57 33 light 
sacred —53 $ sacred- 81 
profane —56 profane 
thick 71 thick- 44 
thin 48 —33 thin 
nice —61 nice- 87 
awful —60 awful 
bright —58 bright- 69 
dark 44 —37 dark 
bass 51 bass- —99 47 
treble 60 treble 
angular 60 angular- 43 
rounded —32 41 rounded 
fragrant —32 62 fragrant- 84 
foul 42 —60 foul 
active —31 63 active- 59 
passive —44 —49 passive 
rugged —48 | rugged- —42 60 
delicate —53 delicate 
pungent 64 pungent —30 26 
bland 49 —36 bland 
calm —51 35 calm- 61 —36 
agitated —62 —83 agitated 
excitable —82 
excited 71 


Note.—Only factor loadings of .30 or higher are indicated in this table; decimals have been omitted. 


and dull (—.40). This factor apparently is 
similar to at least part of the Activity factor 
of Osgood et al., although this is not neces- 
sarily the best name for it. Factor III may 
be bipolar, but the positive pole is too 
poorly defined to be sure. 


How do these results compare with the 
stated hypotheses? Obviously there appears 
to be some small measure of agreement, but, 
as one would anticipate from the correla- 
tional results, the degree of agreement is 
very limited. A few OSD paired opposites 
did occur: three in the most obvious set of 
scales and one in the least obvious set. In all 
four cases the positive loadings have a 
somewhat higher absolute value than the 
negative loadings have. Therefore, while 
symmetry was not really complete, it still 
was close enough to permit saying that one 
(Factor II in Table 6) and possibly both 
(also Factor VI in Table 7) of these factors 
are bipolar. 

Hypothesis IV. Factor analysis should 
serve as a useful guide either to the proper 
definition of bipolar scales if they exist, or 
to the discovery of functional opposites 
which might act as scales, whether they 
represent any attributes of existing systems 
or not. Concerning these possibilities, one 
should immediately note in Table 6 that 
the adjective unpleasant appears on the 
bipolar evaluative factor (Factor II), but 
that the opposite for it presumed by Osgood 
et al. does not. One is tempted to conclude 
that happy should be paired with unpleas- 
ant on this factor; at least this would help 
to give it an appearance of symmetry. It 
will be remembered from Table 4 that un- 
pleasant usually had its highest negative 
correlation with good but sometimes with 
happy. Happy often had its highest cor- 
relation with unpleasant but almost as often 
with cold (a very unpleasant state!). Cold, 
too, appears on the factor, which is quite 
consistent with the correlations. It might 
be reasonable to think of unpleasant as 
being opposed to either good or happy, and 
happy as being opposed to either unpleas- 
ant or cold. If unpleasant and happy were 
to be paired as opposites on a scale, this 
would have to be done in spite of the fact 
they appear with like signs as secondary 
loadings on Factor IV. 

On Factor IV the adjective happy has a 
large number of strange associates, and its 
presence in that factor might be a “sam- 
pling” error that could have occurred at 
any stage of the analysis, including the 
rotations. On Factor IX, however, pleasant 
also appears together with some seemingly 
unrelated adjectives. In both factors, the 
predominant tone of the adjectives is negative, 
and yet these two very positive adjectives 
are positively related to their respective 
sets of companions. Apparently, we 
have to entertain the possibility that being 
happy and being pleasant can be positively 
related to various bad or negative attri- 
butes. 

The failure of pleasant and unpleasant to 
appear as opposites on the same factor has 
occurred in a previous study as well. In the 
results of Green and Nowlis (1957), who 
were studying the structure of mood (or 
affective states), a factor emerged that 
could be called pleasantness, but which had 
no negative counterpart. Words intended to 
identify the unpleasant end of an assumed 
bipolar factor instead split up into various 
specific types of unpleasant feelings such as 
anxiety and depression. 

It will be remembered that the OSD pair, 
red and green, correlated positively. They 
appear on the same unipolar factor (Table 
6, Factor VI) along with such adjectives 
as pleasant, hot, dishonest, long, and nar- 
row, although the loading of green on the 
factor is secondary. Green also has a major 
or primary loading on Factor V. In Factor 
VI, red seems to imply, or at least to covary 
with hot, while in Factor V, green seems to 
be related to something wet and perhaps 
even cool. This sounds as if the two were 
opposites, but the factors occur on quite 
independent axes. Clearly, red and green are 
not semantic opposites so far as this set of 
concepts is concerned. 

Even though names have been attached 
to several of the factors shown in Tables 6 
and 7, we do not feel that any serious effort 
to defend these particular labels should be 
made; the fruitfulness of doing so is not 
immediately apparent. We have, of course, 
indicated any substantial agreement which 
might have occurred between our factors 
and the “big three” of the OSD factors. As 
we have seen, there was some such corre- 
spondence, even though this was unstable 
from the analysis of one set of adjectives to 
the other. 

As was apparent in the correlational 
results, and as will be even more obvious in 
the next section covering the single concept 
analyses, there is a great deal of concept-scale 
interaction. This fact is not so clear 
in the two across-concepts factor analyses. 

TABLE 8 
Farroa Пане or Dusieiscat Coscerr Ахалтака Usıxo "Моят Onvtocs" Ansecriyes 


йм Yee 
кыйы КИП | 
- ~—j ! miwiv vm | x X |xi|xm 
60/54/34 | 14) 08 0-2 |10| 30 |10 04 
mor n n. 1 | о Е | 
70 [23| 41 | 05 | 22 13 | 24 | 10 | 
LUI ю 4 1 2 0 1 
> Bi? EBi E 
so | 61) 05 | 50| 06 0-1 | 04 | 
тилот | 9$ | 0 0 
100 | 50 [14 |24|05 0:2 | 04 
мәт NE 2 1 | 
E | Bi?A Bi | 
$3 |50| 22 |12|40 6-0 | 1-1 | 
us 9 0 2 3 1 0 0 
BE Bi? | 
66 |40| 30 | 30 | 3.0 24 | 50| 13 | 0-1 
LE n 2 4 4 1 1 | 
Bi E Bi? Bi | 
i 
60 |05 |32 |21 |20 1-2 | 30 | 24 | 1.0 
"тате и 2 4 2 1 1 1 
Bi |Bi E Bi? [E |Bi?P 
44 |50| 7-0 |30 | 5-0 0-0 
cor 7 1 2 3 
Bi E 
51 |50| 3-0 |22 | 13 3-0 | 4-5 | 30 | 1-1 
TORNADO ю 4 4 ?|1 3 1 
| Bi PBi P Bi Bi A 
32 | 50| 24 | 24 | 12 
MOTWER 9| 3 5 2 1 Ab: "d Ead 
Bi E Bi?P|Bi ЕВі P E 
Note.—The three rows in each cell under the factor columns refer to: (a) the number of positive and 
negative loadings over .30 on the factor; (b) the number of Osgood et al. pairs 
loaded with opposite signs; (c) judgment that the factor was definitely bipolar (Bi) or questionably 
bipolar (Bi?) and which Osgood et al. factor, if any, was matched in total or in part by the factor. 

If a factor was not bipolar, then the factor label (when it was appropriate) is in the right corner if 
mainly composed of adjectives from the positive end, and in the left corner if negative. 

judgments for each factor concerning bipolarity; 
and, finally, factor matches, in whole 
or in part, with the three main factors of 
the semantic differential. 

Because there were so few factors with 
paired against their 
kenn m =e definitely label the factor 
as bipolar. Exceptions to this general 
rule occurred when the opposed loadings 
which did occur were both high and therefore 
approximately equal. In other instances, 
we have entered a question mark 
in the cell to indicate a possibility 
that at least some bipolarity was present. 
In a few cases, this possibility is indicated 


average of 9.7 per concept. Thirty of these 
97 factors had at least one of the OSD most 
obvious pairs loaded on it with opposite 
signs. Eighteen of the 30 factors were in our 
judgment clearly interpretable as bipolar. 
The least obvious adjectives yielded 
96 factors, 22 of which had at least one 
OSD pair loaded on it with opposite signs; 
of these 22 we felt that 7 could be clearly 
interpreted as bipolar. Among the most 
obvious adjective analyses, 11 of the 18 
clearly bipolar factors occurred on the 
none. 

Hypothesis III, as stated in the previous 
section, was clearly supported in that there 
is a greater likelihood that bipolar factors 
will be found if one selects adjective pairs 
which appear to be unambiguous in their 
opposing meanings. Even so, it is also clear 
that the more typical outcome is for 
unimodal (i.e., nonbipolar) factors to appear. 
When bipolarity does appear, it tends 
to be concentrated on certain concepts. This 
outcome, however, interacts with the par- 
ticular set of adjectives used. It would ap- 
pear that the use of certain adjectives as 
opposing ends of bipolar continua becomes 
crystallized in relation to certain concepts 
and not to others. This interaction is most 
strikingly illustrated by the first factor ob- 
tained for the concepts SIN, GOD, COP, and 
MOTHER with the most obvious scales. These
four concepts, all of which are closely as- 
sociated with morality in our society, re- 
sulted in a relatively clear-cut bipolar 
evaluative factor. For the “least obvious” 
scales, this bipolar evaluative dimension 
was not nearly so well defined. 

In Tables 8 and 9 we have pointed to any 
partial or relatively complete correspond- 
ences between our factors and the big three 
of the semantic differential. According to 
the data presented in Table 8, 23 of our
factors for most obvious adjectives were
made up predominantly of adjectives from
OSD Evaluative scales, 10 of these being
shown as either bipolar or with a question
mark; 17 factors were made up of predominantly
OSD Potency adjectives, 10 being
shown as either bipolar or with a question
mark, with 3 of the 10 occurring on the concept
TORNADO; 3 factors were made up of
predominantly OSD Activity adjectives, all
of which were bipolar or questioned as such.
In Table 9, 18 of our factors for least
obvious adjectives were made up of predominantly
OSD Evaluative adjectives, 8
of which were bipolar or possibly so; only
... factors were made up of predominantly
OSD Potency adjectives, 4 of them being
bipolar or possibly so; 18 of the factors
were made up predominantly of OSD Activity
adjectives, 9 of them being bipolar or
potentially bipolar. Obviously, if one accepts
the OSD definitions of these factors
and if one holds that the occurrence of sub...
OSD in general, then we have here support
for the existence of Evaluative, Potency,
and Activity factors. The support, if it is
such, is in terms of the occurrence of a large
... being unipolar than bipolar. The kind and
extent of support also varies with the subset 
of adjectives used. 

Because of the nature of our evidence, 
we do not believe that we have developed 
support for the notion of generalized (i.e.,
across-concepts) semantic space which can
meaningfully be defined by the bipolar
dimensions of Evaluation, Potency, and 
Activity. The makeup of the factors is too 
varied and too much dependent on the par- 
ticular concept involved to warrant this. 
We do feel, however, that these individual 
concept analyses do yield support for the 
idea that there are in fact such things as 
bipolar dimensions which are used to de- 
scribe the attributes of certain concepts. 
This observation can be highlighted further 
by turning to a review of Tables 10 and 11. 

In the large upper portions of Tables 10
and 11, we present tabulations of every in- 
stance in which any two of the adjectives 
which formed OSD scales both had loadings 


18 instances of OSD adjective pairs
being essentially balanced

(i.e., XXX categories), 11 instances of
pairs being in an ambiguous (i.e., XX)
category, and 16 instances of pairs being
unbalanced around zero in that their
... shows 13 instances of essentially balanced
pairs, 2 pairs in an ambiguous category,
and 12 pairs that were unbalanced


certain adjective pairs are more likely to 
be bipolar and, if bipolar, then also to be 
more symmetrical than others. For the most 
obvious adjectives, the pair that most often 
met both criteria—at least for the set of 
concepts which we used—was clean-dirty. 
Others involving some likelihood of bipo- 
larity were: good-bad, strong-weak, honest- 
dishonest, fast-slow, and alive-dead. In no 
TABLE 10
SYMMETRY OF “MOST OBVIOUS” ADJECTIVE PAIRS HAVING OPPOSITE SIGN
LOADINGS ON THE SAME FACTOR
Scale | BABY | ME | SYMPHONY | LADY | SIN | GOD | STATUE | COP | TORNADO | MOTHER | Freq. of opposite loadings | Across-concept analysis
good-bad | | X | XX x А C X 
large-small X d XXX dm : 
hard-soft x | үе XXX 2 
strong-weak | | XX 8 XX Bing T 
clean-dirty XXX X | XXX XXX 6 х: 
red-green Pos | Pos | Pos Pos Pos 
pleasant-unpleasant Pos 0 Pos
happy-sad | | XXX] s yf и». XX 2
wet-dry Pos | XXX XXX — 2
long-short XX ; x 2 
hot-cold x м R 1 А 
honest-dishonest X X XXX y X 4 X 
fast-slow XXX| XX _ | XX XXX 4 
wide-narrow Pos X | md X XX 3 
alive-dead XXX 
| XX |XXX*| XXX XXX 5 
Freq. of oppos. 

loadings 1 7 0 3 4 7 5 4 8 6 3 
Freq. of XXX, XX, 

&X 1-0-0 | 1-2-4 | 0-0-0 | 0-1-2 | 1-2-1 | 4-1-2 | 4-0-1 | 2-1-1 4-3-1 | 1-1-4 0-1-2 
happy-unpleasant x XX 20:9 ххх 4 XXX 
good-unpleasant XXX| XXX X X X XXX) 7 XX 
happy-cold XXX| X 2 XXX 


Note.—X indicates that the scale is not balanced around zero (i.e., absolute difference between
adjectives is over .20).
XX indicates that the scale is somewhat balanced around zero (i.e., absolute difference between
adjectives is between .15 and .20).
XXX indicates that the scale is essentially balanced around zero (i.e., absolute difference
between adjectives is less than .15).
“Pos” indicates that the adjective loadings on a given factor had the same sign.
* This scale was essentially balanced around zero on two separate factors for this concept.


case was an adjective pair bipolar on all
concepts, although this could have been due
to sampling fluctuations at some stage of
the various processes used.

Among the least obvious adjective pairs
(Table 11), no concepts had more than four
bipolar pairs of adjectives. There were,
however, three pairs of adjectives that
showed a rather definite tendency to be
bipolar: nice-awful on seven concepts,
calm-agitated on seven, and active-passive
on five.

At the very bottom of Tables 10 and 11,
there are illustrated certain facets of the
problem of empirically pairing adjectives.
Table 10 indicates the number of times


happy-unpleasant appeared together with 
opposite signs. This happened on four con- 
cepts and on the across-concept analysis, 
which was more often than it occurred for 
the OSD pair happy-sad. In a sense, then, 
unpleasant might be a better adjective to 
use with happy than sad. However, the 
problem gets further complicated by the 
fact that unpleasant seems to have a bet- 
ter opposition with good than it does with 
happy. Unpleasant was opposed to good 
seven times, and three of those were bal- 
anced around zero. In fact, unpleasant was 
somewhat more consistently opposed to 
good than bad was! 
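
The symmetry coding used in Tables 10 and 11 can be restated procedurally. The following Python sketch is an illustration only and was not part of the original analysis. The function name `symmetry_code`, the rounding to two decimals, and the handling of a zero loading (treated as not same-signed) are assumptions; the thresholds are those stated in the note to Table 10, and the example loadings are taken from Table 12 with decimals restored.

```python
def symmetry_code(loading_a: float, loading_b: float) -> str:
    """Code one adjective pair on one factor, following the note to Table 10.

    Assumption: loadings are given as decimals (e.g., 0.50 rather than 50).
    """
    # Loadings with the same sign are not opposed; the tables mark them "Pos".
    if loading_a * loading_b > 0:
        return "Pos"
    # Balance around zero: absolute difference between the two magnitudes.
    # Rounding to two decimals (an assumption) avoids floating-point edge cases.
    difference = round(abs(abs(loading_a) - abs(loading_b)), 2)
    if difference < 0.15:
        return "XXX"  # essentially balanced around zero
    if difference <= 0.20:
        return "XX"   # somewhat balanced around zero
    return "X"        # not balanced around zero


# Example loadings from Table 12.
print(symmetry_code(0.50, -0.60))  # small-large on GOD, Factor X
print(symmetry_code(-0.48, 0.66))  # good-bad on SIN, Factor I
print(symmetry_code(-0.36, 0.72))  # clean-dirty on SIN, Factor I
print(symmetry_code(0.61, 0.74))   # pleasant-unpleasant on SIN, Factor I
```

Coding each pair separately for each concept, as here, is what allows the same pair to be balanced on one concept and unbalanced or same-signed on another.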


Further complications in the problem of 


TABLE 11
SYMMETRY OF “LEAST OBVIOUS” ADJECTIVE PAIRS HAVING OPPOSITE SIGN
LOADINGS ON THE SAME FACTOR
Scale | BABY | ME | SYMPHONY | LADY | SIN | GOD | STATUE | COP | TORNADO | MOTHER | Frequency of opposite loadings | Across-concept analysis
tasty-distasteful 0 
sharp-dull Pos | | 0 
heavy-light | 0 Pos 
sacred-profane XX | X | | 2 
thick-thin XXX x XXX| 3 
nice-awful XXX X X |XXX| X |XXX| X 7 
bright-dark XXX 171 
bass-treble 0 
angular-rounded F0 
fragrant-foul | 0 
active-passive x XXX| XXX XXX| X 5 XXX 
rugged-delicate XXX| Pos t XXX 2 
pungent-bland Pos 0 | 
calm-agitated XXX| X XX|X X SX SOCIO T rE 
Frequency of op- 
posite loadings 2 3 1 1 4 3 2 4 4 3 2 
Frequency of XXX, 
XX, & X 1-0-1 | 2-0-1 | 1-0-0 | 0-1-0 | 1-1-2 | 1-0-2 | 2-0-0 | 1-0-3 2-0-2 | 2-0-1 1-0-1 
| 
calm-excitable x X X |XXX X XXX| 6 x 
calm-excited xi x X | XXX x X | XX 7 x 
bright-dull X XX | 2 
nice-distasteful XXX| X xxx|xXX| X | XX | XX | X 8 
active-dull XX ISS XXX| XXX] XXX XX |XXX| X 8 x 


Note.—See footnote for Table 10. 


pairing adjectives are illustrated in Table 
11. First, our interest in comparing the 
appropriateness of the adjectives agitated, 
excited, and excitable as opposites of calm
seems to have resulted in a sort of stand-off. 
According to these findings all three can 
be used, and all three will be bipolar to 
about the same extent on essentially the 
same concepts. Second, it is clear that nice 
and distasteful can be used as opposites.
They occurred together with opposite signs 
8 out of the 10 times. Tasty and distasteful 
never did occur together—possibly because 
they simply are not relevant to the set of 
concepts used. Nice did appear with awful 
with opposite signs on the same factors on 
seven concepts, all but one also being in- 
cluded in the eight instances in which nice 
and distasteful occurred with opposite signs 
on the same factor; perhaps nice should be 
more appropriately paired with distasteful. 
Finally, active and dull appeared together 


eight times. Sharp and dull never appeared 
together, but active did appear with passive 
five times. Consequently, active and dull 
would seem to make the more consistent 
pairing. 

The hazard of setting up adjective pairs 
as opposing poles of scales without some 
kind of empirical check on the validity of 
the pairings is also illustrated in Tables 10 
and 11. For instance, pleasant-unpleasant 
did not appear on any factor with loadings 
having opposite signs, although they did 
appear with the same sign on one concept— 
SIN. The loadings were .74 and .61, respectively,
so this could scarcely have occurred
as a sampling fluctuation. Red and green 
appeared together with similar signs on six 
concepts. Obviously, we can now say with 
assurance that these two adjectives should 
not be considered as semantic opposites. 
The adjective pair wet-dry appeared with 
approximately equal loadings and the same 


TABLE 12

ILLUSTRATIVE SETS OF FACTOR LOADINGS FOR
“MOST OBVIOUS” ADJECTIVES

Concept | Factor | Adjective | Loading | OSD loading | Our judgments: Factor | Bipolarity
BABY \ red —39 | A 
green —43| A 
| pleasant -32| E ? | No 
| wide —41| P 
narrow —61| P 
BABY | VII | large —71 | PA 
hard —34 | PE 
strong —# | P 
pleasant -42| E P | No 
dry —39 | —* 
long —55| P 
ME IV | red -57 | A 
green —54 | A 
wet -85| — ? | No 
dry —36 | — 
wide —49| P 
SIN I | pleasant 61| E 
unpleasant 74| E 
good —48| E 
bad 66| E 
large 32| PA| E Е 
clean —36| E 
dirty 72| E 
happy —4/ E 
sad 50| E 
cold 38 {| A 
dishonest 62| E 
SIN III | large —31 | PA 
weak 43| P 
alive —51| А ? ? 
dead 32| A 
сор | VIII | clean 41| E 
alive 40| A 
dead oT ear 
pleasant —41| E ? ? 
cold кө jena} 
narrow —50| P 
GOD X |small 50| PA 
large —60 | PA 
strong —39| P | P | Yes 
wide —601| P 


Note.—Decimals have been omitted from the 
factor loadings. 

* The dash (—) indicates that the scale did not 
load substantially on any of the OSD factors. 


GREEN AND GOLDFRIED 


signs on the concept ME, but also appeared 
with approximately equal loadings and op- 
posite signs on STATUE and TORNADO. 

Many readers might well feel frustrated 
at looking at all these results without being 
able to see the numbers on which they are 
based. For this reason, several of the kinds 
of considerations and problems which were 
involved in attempting to summarize and 
interpret the results are illustrated by the 
sample set of factor loadings presented in 
Tables 12 and 13. Some of the problems 
concerning the question of defining bipolarity
are illustrated by Factors I and III
from SIN and by Factors VIII and X from
GOD in Table 12 and by Factor VI from
BABY, Factor XI from GOD, and Factor VII
from COP in Table 13.

Factor I from SIN in Table 12 had several
difficulties. First, pleasant and unpleasant
had high positive loadings. Second, sad,
dirty, and bad all had loadings higher than
did happy, clean, and good. Honest had
a loading of -.24 and so it was not included
in the table. This factor may well have
been bipolar, but the negative loadings
were too few and too low to adequately
influence the final location of the axis which
defined the factor. The reluctance to label
this factor as being bipolar was also due
to the fact that large, cold, and dishonest 
did not have opposed OSD pairs with load- 
ings large enough to be included in the 
table. 

Factor VIII on GOD from Table 12 might
well have been called bipolar, but we have
listed it as questionable here, as we have
in Table 8. This factor was largely made
up of adjectives that were not opposed pairs
in the OSD system. Does it involve semantic
opposition? If it does, it is very
difficult to account for pleasant as being 
opposed to clean and alive, but directly 
related to dead, cold, and narrow. 

Turning to Factor VI from BABY in Table
13, there were four loadings, two of which 
were OSD opposites (passive and active). 
Inasmuch as this pair was far from bal- 
anced, we have listed this factor as being 
of questionable bipolarity. 

Factor XI from the concept GOD had three
loadings, one of which was opposed to the 
TABLE 13
ILLUSTRATIVE SETS OF FACTOR LOADINGS FOR
“LEAST OBVIOUS” ADJECTIVES


Concept | Factor | Adjective | Loading | OSD loading | Our judgments: Factor | Bipolarity
BABY II | distate- | —66 
ful 
nice 72 E |E!|?(Not 
dull —33| A OSD 
| pair) 
BABY VI | passive 61| A 
active —31 A 
dull 465 A | Al? 
bland 57| EA 
BABY | XII| sharp —48| A 
thin —59| P ? | No 
angular | —44| А 
ME III | nice 6 E 
bright 68 E 
distaste- | —38, E | E | No 
ful 
active 33) A 
dull —32| A 
ME IV | heavy 65| PE 
thin —60| P 
thick 56, P 
rounded 65 A |P?|? 
pungent 32| EA 
ME VI | tasty —60 E 
fragrant | —45| E | E | No 
dull 31| А 
GoD XI | dark 50 E 
light —46 PE | E |? (Not 
distaste- 32 E OSD 
ful pair) 
STATUE | I | excited 30 А 
profane 45 E 
bass 54 PE | ? | No 
foul 61| E 
MOTHER| V | heavy —58| РЕ 
thick —44 P |P | Yes 
thin 53| P 
rounded | —44| А 
cop VII | passive 50 A 
active —603| A 
dull 4| A 
bland 39 EA | A |? 
profane 32 E 
distate- 30 E 
ful 


Note.—Decimals have been omitted from the factor loadings.


other two, but none of which was from the 
OSD pairings. The opposed adjectives 
(dark-light) were, however, semantically 
opposed to each other. This factor was not 
considered to be clearly bipolar, in that 
these adjectives did not form an OSD op- 
posing pair. 

Finally, Factor VII on COP had five positive
and only one negative loading. The
negative loading on active was paired with
its OSD opposite (i.e., passive) and was
high. It was well enough determined to be
called bipolar. If, however, this was to be
called a bipolar factor, then the other four
adjectives should also have opposing pairs.
Because active-passive was the only bipolar
pair, the bipolarity of this factor was considered
to be questionable.

Problems of factor matching are illus- 
trated by those factors shown in Tables 12 
and 13 which have a question mark under 
the column referring to our judgments of 
the factor label. Factor V from BABY was
loaded with adjectives from all three of the
OSD factors (Table 12); Factor III from
SIN had two of them about equally weighted.
Some matches, however, were quite good.
For example, Factor I from SIN included a
large number of the OSD evaluative adjectives.
Other factors had only two or three
adjectives, even though all might not have
been from the same OSD set. Thus, Factor
II from BABY (Table 13) had two large
opposed loadings from OSD evaluative adjectives.
The two were not from the same
OSD scale, but it would appear that for
BABY they could just as well have been used
to define a single scale.

Summary of Factorial Results. In many 
respects, the factorial results are similar to 
and in fact are predictable from the corre- 
lation results. The factorial results, do, 
however, add some important information 
concerning the probable reality of the pres- 
ence of at least some bipolar factors. Under 
the most lenient definition of bipolarity 
that could be adopted, only 20% of the 
factors obtained in the two across-concepts 
analyses could be called bipolar. Similarly, 
only 25% of the factors obtained from the 
20 individual-concept analyses could be 
called bipolar. In the large majority of 
cases, no bipolarity occurred. Even so, certain
adjective pairs are more likely to be
bipolar than others, and if bipolar, then
also to be balanced around zero. These pairs
are more likely to be in our most obviously
opposite classification than in our least
obvious category. Inasmuch as these latter
results were inferred from the bipolarity
hypothesis, it appeared that there is in fact
some tendency for reciprocally antagonistic
adjectives to form the poles of corresponding
scales. The appearance of this tendency,
however, is clearly dependent on the con- 
cept or set of concepts involved, and there- 
fore is not necessarily a generalized phe- 
nomenon. 


GENERAL Discussion AND CONCLUSIONS 


Finally, now, we can attempt to outline 
some of the implications of these results 
for the structure of semantic space in gen- 
eral and for the semantic differential format 


in particular. 

It will be recalled that Osgood et al. 
assumed that the intrinsic nature of the 
language is such that we tend to utilize 
bipolar adjectival continua along which we 
describe the connotative meaning of con- 
cepts. Osgood and his associates further 
assumed that these adjective pairs in turn 
are interrelated in a manner that would 
allow them to be described by a limited set 
of axes in a factorially defined semantic 
space. 

We have now seen that when subjects 
are not forced to conform to the kinds of 
seales constructed by Osgood et al.—when 
they are not forced to make judgments on 
presumed bipolar scales—they more often 
than not use each adjective as if it were a 
unipolar scale and not simply one end of a 
bipolar scale. However, the fact that at 
times some scales do appear to be bipolar 
should not be underemphasized. This fact
appears to leave us in an ambiguous situa- 
tion. Some of our evidence appears to sup- 
port a kind of bipolarity, i.e., the existence 
of reciprocally antagonistic verbal oppo- 
sites at the polar points of attribute scales. 
Other evidence appears to negate it. 

In order to work toward a resolution of 


this problem, let us proceed by asking a 
series of questions and exploring possible 
implications and answers.


Question Number 1. 

For any two adjectives which we typically
think of as being antonyms, does the
failure to find a significant negative correlation
on some concepts, but not on
others, negate the hypothesis of intrinsic
reciprocal antagonism between them? The 
notion of intrinsic antagonism implies that 
the two adjectives should form a fixed and 
constant relationship. If such a relationship 
exists at all, under what conditions might 
it result in the expected negative relation- 
ship for some concepts, but in a zero or 
even positive relationship for others? 
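
The question can be made concrete with single-adjective ratings: for one concept, each subject rates the concept on each adjective separately, and the two sets of ratings are then correlated. The Python sketch below is an illustration under stated assumptions and not the procedure used in this study. The function names, the cutoff of -0.30 (which stands in for a proper significance test), and the ratings themselves are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def functionally_opposed(ratings_a, ratings_b, cutoff=-0.30):
    """True if the two adjectives behave as opposites for this concept.

    The cutoff is a hypothetical stand-in for a significance test.
    """
    return pearson_r(ratings_a, ratings_b) <= cutoff


# Hypothetical ratings by four subjects of one concept on two adjectives.
opposed_a, opposed_b = [0, 1, 2, 3], [3, 2, 1, 0]          # used as opposites
independent_a, independent_b = [0, 1, 2, 3], [1, 0, 0, 1]  # used independently

print(functionally_opposed(opposed_a, opposed_b))
print(functionally_opposed(independent_a, independent_b))
```

Running the same check concept by concept is what allows a pair to appear opposed on one concept and unrelated on another.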

One can readily conceive of a situation 
in which a subject is asked to rate a concept 
which is cognitively complex. A concept 
may be said to be cognitively complex if 
it is characterized as possessing a number 
of distinguishable and more or less independent
attributes. MOTHER, for example,
can have many negative as well as positive
characteristics. Each of these characteristics
can in turn be described as being
related to such dimensions as good-bad,
happy-sad, or pleasant-unpleasant. Hence,
there is the likelihood that when applying
a given bipolar scale to a cognitively complex
concept, subjects may simultaneously
react with antagonistic mediators and produce
an “averaged-out” response.

Consider for a moment what happens 
when a subject is asked to rate a concept 
on the standard, bipolar format of the se- 
mantic differential. Since he is able to make 
only one rating per scale, his response may 
indicate either one of two phenomena: (a)
a single mediational response to the concept,
e.g., r_m, or, (b) in the case of a cognitively
complex concept, the net result of two antagonistic
mediational reactions, e.g., r_m
minus r_n. In this second instance, if the
two mediational response tendencies are of
equal strength, the subject will place his
averaged-out rating in the 4-category (i.e.,
both sides of the scale are equally associated
with the concept). If the two mediational
response tendencies are of unequal strength, then the
“averaged-out” rating will be placed either
left or right of the continuum.
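
The averaging-out argument can be illustrated with a small Python sketch. It is offered only as an illustration under strong assumptions and is not a model proposed in the text: the two antagonistic mediational reactions (called `r_m` and `r_n` here) are given hypothetical integer strengths from 0 to 3, and their difference is mapped linearly onto the seven categories with 4 as the midpoint. Which pole is treated as positive is arbitrary.

```python
from typing import Tuple


def bipolar_rating(r_m: int, r_n: int) -> int:
    """Single rating on a 7-category bipolar scale from two opposed tendencies.

    r_m and r_n are hypothetical strengths (0-3); only their difference
    reaches the rating, which is clamped to the categories 1 through 7.
    """
    net = r_m - r_n
    return max(1, min(7, 4 + net))


def single_adjective_ratings(r_m: int, r_n: int) -> Tuple[int, int]:
    """With one scale per adjective, each tendency is recorded separately."""
    return (r_m, r_n)


# Two strong, conflicting tendencies and no tendency at all both land in
# the 4-category on the bipolar scale ...
print(bipolar_rating(3, 3), bipolar_rating(0, 0))
# ... but remain distinguishable on single-adjective scales.
print(single_adjective_ratings(3, 3), single_adjective_ratings(0, 0))
```

Under these assumptions a midpoint rating on a bipolar scale cannot separate a concept that evokes both reactions strongly from one that evokes neither, whereas single-adjective ratings keep the two cases apart.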

One has to be careful to distinguish
the possible circumstances in which an
averaging-out process might occur and
what the consequences of each might be.
Actually, rating a cognitively complex concept
on OSD types of scales could require
two kinds of averaging. It could require an
averaging across a series of concept attributes
relevant to the dimension along which
the rating occurs, and, in addition, it may
require an averaging due to the paired adjectives
not being intrinsically opposites
for the concept being rated. If a subject is
asked to rate the concept with a single-adjective
form of the semantic differential,
however, the second kind of averaging will
be avoided. The ratings obtained, consequently,
would be less ambiguous.

If, when using a single-adjective scale
to rate a complex concept, a subject really
considers only certain attributes and not others,
and if the subset of attributes across which
he averages is different for the OSD opposite
adjective, then this too could account
for differential results on the OSD versus
the single-adjective scales. Again, the single-adjective
scales would result in less confounding
in the ratings. Note, however, that
for this to occur it would have to be true
that the two adjectives in an OSD pair are
differentially associated with different characteristics
of the same concept. This should
not be possible if they are intrinsically antagonistic.

It is essential to note that the absence of
negative correlations between OSD paired
adjectives does not imply that there is no
“true variance” associated with subjects’
ratings. That this is so is supported by the
correlation coefficients. For instance, even
though hot and cold correlate zero on ME,
hot correlates .25 with strong, while cold
correlates .42 with unpleasant and .40 with
dirty. There were several such examples in
the data. Another was wide versus narrow
correlating .01 on MOTHER, while wide versus
large was .46, and narrow versus long was
.329. Virtually all such examples in the data
seem reasonable in terms of what is covarying
with what. But, more important,
they illustrate the presence of true variance.
The fact that the correlations themselves
are usually unbalanced, both in size
and in terms of which adjectives are involved,
implies that the adjective pairs
involved are to a considerable extent independent
and, therefore, cannot be considered
semantic opposites in a functional
sense. It would seem that over-all, generalized,
habitual, or stereotyped responses to
cognitively complex concepts do form, and
that they are often relatively specific to
one of an OSD adjective pair rather than
to both.

We might mention here two other ways
in which cognitive complexity can occur.
It can occur when the individual members
of a concept class have different amounts
of the attribute which the scale represents.
For instance, the members of the class of
compositions called SYMPHONIES vary in
how fast or slow they are. A response to
the concept SYMPHONY on the scale fast
would, therefore, have to be an averaged
response somewhere on the scale. The second
way in which complexity can occur
comes from the fact that an individual
member of the concept class can be variable
from time to time on an attribute being 
rated. These two last sources of complexity 
are often confounded in that some concepts 
are recognized as being quite variable both 
within and between members of its class, 
such as SYMPHONY, on fast and slow. On the 
other hand, some concepts are perceived 
as being much less variable in both these 
senses, such as TORNADO, on fast and slow. 
The presence of bipolarity may well be 
related to one or any combination of 
these kinds of complexity. For instance, 
fast and slow did form a balanced bipolar 
factor on TORNADO, but did not on SYMPHONY. 
This would suggest that when complexity 
is not present, the pertinent paired adjectives
will be found to be semantic opposites.
One can point, however, to discrepancies
in this explanation for this kind
of result. Take, for example, the concept
STATUE. Among the attributes (defined by
dimensions) which would seem to be relatively
nonvariable between or within
STATUE are: hard-soft, clean-dirty, wet-dry,
alive-dead, hot-cold, fast-slow, and
perhaps strong-weak. The first four of
these resulted in balanced bipolar factors, but the
last three did not. There does not appear
to be any immediate explanation for this
lack of ...

Ideally, any test for bipolarity should
eliminate as much as possible the confound- 
ing variable of cognitive complexity. This 
conceivably may be achieved by testing the 
bipolarity assumption on cognitively ho- 
mogeneous concepts, such as simple tones. 
Greater simplicity might additionally be 
achieved by using duller “cognizers” as 
subjects; obsessive individuals can prob- 
ably find conflicting attributes in even the 
most simple of concepts! 


Question Number 2. 


Does the failure of supposed polar op- 
posites to appear as such with opposite 
signs on a factor negate the hypothesis of 
intrinsic reciprocal antagonism between 
them? This question is, of course, much the 
same as Question Number 1, and many of
the same considerations hold in regard to
it. There is one additional point which is
relevant here, however. It is quite possible
that the factorial methods used to investigate
the problem were coarse enough to
produce our findings, at least part of the
time. As has already been pointed out, however,
the factorial results are in the main
consistent with what would be expected in
the light of the correlational results. In fact, 
evidence for the presence of intrinsic bi- 
polarity’s occurring on at least some con- 
cepts was stronger in the factorial results 
than it was in the correlational results. But, 


results would 
happen. 

In taking the position that the data do 
not adequately support a conclusion in 
favor of intrinsic bipolarity, it is important 
to distinguish again between functional bi- 
polarity and intrinsic bipolarity. It does 
seem defensible to hold that there is a 


functional use of adjective pairs as oppo- 
sites in relation to certain concepts and not 
to others. It is necessary to make this distinction,
because we agree that it is still
possible that people do in fact learn relationships
between pairs of adjectives that
in a purely abstract sense are reciprocally
antagonistic. If they do, however, it is by 
no means clear that what is learned as an 
abstraction is carried over into functional 
use in any absolute sense. For instance, it 
is very clear that certain word pairs which 
almost anyone will agree are rational op- 
posites, such as pleasant-unpleasant, are 
not functional opposites at all. At least they 
are not functional opposites when the sub- 
ject is left free to use them in what seems 
to us to be a reasonably normal manner. It 
would also appear that for most adjective 
pairs in which abstract opposition may in 
fact be learned the emergence of functional 
opposition will depend on the concept. Even 
when the adjective loadings are balanced 
and high enough on an axis to be called a
factor, their loadings are rarely if ever 
sufficient to account for more than half of 
their probable true variance. 


Question Number 3. 


Might our results be in part explained in 
terms of the population we sampled? This 
too is a methodological problem. It must 
be remembered that our data consisted of 
responses obtained from a group of univer- 
sity students. If the work were replicated 
using subjects who were less capable “cognizers”
and therefore less likely to consider
the several characteristics of the concepts, 
would bipolarity occur more frequently in 
the results? Conceivably it might, and such 
a replication would seem to be worthwhile. 
In the meantime, however, it is of impor- 
tance that within the sample used the re- 
sults came out as they did. This is par- 
ticularly so in that much of the application 
of the OSD has involved the use of samples 
similar to ours. In fact, college students 
served as subjects in most of the original 
factor analyses carried out by Osgood et al. 
(1957). We are pointing here to the pos- 
sibility of levels by “treatment” interac- 
tions returning to haunt investigators who 
neglect the fact that there is at present no 
answer to the problem of the un-


Question Number 4.

Is there any meaningful sense in which
we can speak of something that can be
accepted as “semantic space”? We find
no reason to believe that there is, in fact,
such a thing as a generalized, across-concepts
semantic space. One can readily agree
that there is a factorial structure which
can reasonably be called the semantic
space of a concept. However, this structure
will differ a great deal from one concept
to another. For the concepts we have used
it will contain an average of about nine
factors; it will usually require more than
three of these factors to account for most
of the true variance associated with the
set of scales which might be used.


Question Number 5.

Is the semantic space of a concept bipolar?
Again, the answer is that it depends
on the concept. The space associated with
a concept appears to be composed largely
of a variety of unipolar factors, but there
will often be one or more bipolar factors
associated with it as well. For at least a
few concepts there may well be several
bipolar factors. Apparently this may depend
heavily on the extent to which we use
both of some pair of adjectives in relation
to the concept. Such pairings are not necessarily
limited to adjectives we normally
think of as antonyms. Hence, setting up 
such pairings in relation to particular con- 
cepts should be guided by correlational or 
factorial methods, rather than by subjec- 
tive opinion. 

The evaluation of bipolarity rests not
only on the assumption that adjectives are
used as antonyms but also that they are
used in a relatively symmetrical fashion.
As indicated in Tables 10 and 11, the functional
symmetry of the “most obviously”
opposite OSD scales is better than the
symmetry of the “least obvious” scales.
On more of an absolute basis, however, the
amount of symmetrical usage leaves much
to be desired. This finding confirms Ross
and Levy’s (1960) similar conclusion that
the conception of antonyms as representing
opposite and equal forces in semantic space
is an artificial one.


Question Number 6.

What implications do our results have for 
the bipolar format of the semantic differen- 
tial? The balance of the evidence presented 
indicates that it is likely that Osgood et al. 
have in fact imposed an arbitrary and arti- 
ficial structure in the domain they call gen- 
eralized semantic space; “selective subjecti- 
vism" has in fact had much to do with the 
structure of the semantic differential. This 
of itself is not necessarily bad or improper. 
It does mean, however, that the continued 
use of scales as they are currently constructed
should proceed with more caution
than might have been exercised in the past. 
Clearly, the bipolar semantic differential 
cannot be thought of as a means of pin- 
pointing a concept’s position in anything 
more than an arbitrary three-dimensional 
bipolar system. 

In drawing inferences from results, one 
needs to take into account the properties 
of the measuring instrument used and to 
consider carefully the relevance of the 
measure to the particular problem. No one 
can argue with the use of a scale that pro- 
duces positive results which are useful in 
psychological theory. One may well feel 
that this has already happened to a suffi- 
cient degree to more than justify the se- 
mantic differential as a tool for general use 
in psychology. We would propose, however, 
that interpretations of results obtained from 
use of this instrument should be made with 
full knowledge of its probable arbitrary 
and artificial character. 


SUMMARY 


The purpose of this paper was to inves- 
tigate the validity of the basic assumption 
which was made by Osgood and his asso- 
ciates when they undertook their project 
of describing and measuring semantic space. 
They assumed that semantic space is bi- 
polar. This assumption was derived from 
the prior assumption that individuals at- 
tribute meaning to signs along bipolar (ad- 
jectival opposites) continua. This latter 
assumption led to the construction of
adjectival scales which were bipolar, and this
required that their investigation of
semantic space must show it to be bipolar.

To investigate the validity of the as- 
sumptions of Osgood and his associates, 58 
single-adjective scales were constructed in 
which 251 college student subjects were
asked to rate concepts according to whether 
or not the adjective was positively or neg- 
atively related to the concept. The adjec- 
tives were then analyzed in two subsets, 
30 consisting of the 15 semantic differential 
(OSD) pairs which were judged to be most 
likely to be bipolar and 28 consisting of the 
14 OSD pairs judged to be least likely to 
be bipolar. The data were correlated and
factor analyzed, both concept by concept
and summed across concepts. 

Over twice as many negative correlations 
over .225 were obtained between the most 
obviously bipolar pairs than were obtained 
between the least obvious opposites. Ten
of the 15 most obvious pairs had definite
negative correlations on half or more of the
concepts; 2 pairs had both positive and
negative correlations, depending on the
concept; 2 pairs had all positive correlations;
1 pair had no correlations as high as .225,
either positive or negative. Far more
significant correlations between OSD pairs
occurred on some concepts than on others.

There were only three of the OSD pairs 
for which the negative correlation between 
the two adjectives was higher than that of 
either adjective with any third adjective 
in half or more of the concepts. These were: 
strong-weak, clean-dirty, and fast-slow. 

It usually occurs that when a third
adjective is correlated in turn against each
of the two of an OSD pair, the two
correlation coefficients are not symmetrical
around zero. The positive correlation, even
though it be between undesirable adjectives,
is usually the higher.

From the correlational results it seemed
clear that various inferences that should
follow if semantic space were intrinsically
bipolar were not confirmed. Rather, the
evidence pointed to a conclusion that to
the extent that bipolarity is a reality it
tends to be specific to the concepts being
defined.


The factor analytic results were, as
expected, in general agreement with the
correlational results. The factor analyses of
the summed across-concepts data produced 
one clearly bipolar factor among nine ex- 
tracted from the most obviously bipolar 
adjectives, and one probably bipolar factor 
from the seven extracted from the least 
obviously bipolar adjectives. The 20 factor 
analyses of single concepts produced a total 
of 193 factors with at least one nonvanish- 
ing loading. Of the 97 factors yielded by 
the most obviously bipolar scales, 18 were 
clearly interpretable as bipolar; of the 96 
yielded by the least obviously bipolar 
scales, 7 could be so interpreted. Forty- 
three of the 97 factors from the most 
obvious scales consisted of subsets of ad- 
jectives from the OSD Evaluation (23), 
Potency (17), and Activity (3) factors. 
Forty-six of the 96 factors from the least 
obvious scales consisted of subsets of the 
OSD big three: Evaluation (18), Potency 
(10), and Activity (18). About half in each 
case were either bipolar or could be ques- 
tioned as such. Among the 193 factors, in- 
cluding 25 which were bipolar, there were 
only 31 instances of clearly balanced and 
opposite-signed adjective pairs and 13 more 
instances of ambiguous interpretability. Of 
the former 31, 18 were from the “most 
clearly” bipolar adjectives, and of the lat- 
ter 13, 11 were. In total, then, there was a 
greater likelihood that bipolar factors and 
symmetrical adjectival pairs would be 
found among the most obviously bipolar 
adjectives. The typical outcome, however,
was for factors to be unimodal and for
adjectives to be relatively independent of
their presumed opposites.

There were wide differences in occurrence 
of bipolar factors on concepts. There were 
also wide differences in the occurrence of 
OSD paired opposites across scales. Any 
tendency for reciprocally antagonistic ad- 
jectives to form the two poles of a single 
scale was clearly dependent on the concept 
or set of concepts involved and therefore 
was not a generalized phenomenon. The
specific, situation-bound character of the
results was well illustrated by the fact that
many of the results seemed to lead to
interesting views of our attitudes and perhaps
of our culture. For instance, we find that
the bad COP is not honest, the honest COP
is not bad, the good SIN is dirty, and the
clean SIN is not pleasant. Such concept-scale
interactions point up the need to define
scales to be used in describing semantic
space empirically through correlational
analyses and factor analyses.

The weight of evidence of this paper, 
then, is that Osgood and his associates have 
in fact imposed an arbitrary and artificial 
structure in the domain they call general- 
ized semantic space. It does not follow that 
the semantic differential is useless, but it 
does follow that researchers should bear its 
characteristics in mind when they use it to 
obtain and interpret data. 
PERFORMANCE ON COMPLEXLY PATTERNED
BINARY EVENT SEQUENCES

Prediction behavior was studied in a context of complexly patterned binary
sequences. Sequences were generated from nonstationary, event-contingent,
partially random sources. A variable of major importance was the presence
or absence of a displayed history of the last eight events in the sequence.
Evidence was found that people seek and find order to some degree in the
environment. The process by which order is sought and found is discussed.

portant were much more difficult to learn than were sheer frequency or
location, and not relations, were important.

Since Humphreys (1939) introduced
the binary choice situation as a
method in psychology, a certain amount of
standardization of technique has been
established in binary choice research. First,
most such experiments have employed
random, noncontingent generating sources,
either with the two events having an equal
probability of occurrence or with one
occurring more frequently than the other.¹ Second,
procedure has carefully prescribed that when
the Nth prediction (or guess) is made, no
display of the prior event series be
available to S (the subject); see, for example,
Edwards (1961). Third, a symbology has
evolved for describing and modeling the
binary choice phenomenon. The predictions
of S are symbolized A₁ and A₂, and the
reinforcing events (or feedback signals)
are symbolized E₁ and E₂, the subscripts
denoting an agreed-upon correspondence of
the two kinds of events between S and
the experimenter.

As Estes (1961) has pointed out, most
experiments on binary choice have
demonstrated that Ss come to match their
predictions (A₁) with the frequency of occurrence
in the sequence of the E₁ events after many
(hundreds) trials.

¹ Two exceptions, Hake and Hyman (1953) and
Anderson (1960), using random event-contingent
stationary generating sources with first-order
contingencies as main variables, found that Ss tended
to match their predictions to the first-order
event-contingent relations as well as to the basic event
frequencies.

Two apparently different basic assump- 
tions have led to two different types of 
theory being applied to binary choice pre- 
diction behavior. The first, exemplified by 
Estes (1950), assumes that S's behavior is 
heavily under control of stimuli and that 
asymptotic behavior will resemble the 
characteristics of the stimulus sequence. 
The second assumption is that Ss bring 
certain hypothesis-generating and hypoth- 
esis-testing behavior to the situation and 
that this behavior will influence both the 
behavioral process (Feldman, 1959; Good- 
now, 1955; Restle, 1961) and the asymp- 
totic behavior (Edwards, 1961). These 
differences may reflect only an interest in 
different aspects of the phenomena. It is 
not obvious to the present authors that the 
differences in assumptions must lead to 
basically different theories. It may be that 
neither a theory which conceives of an 
active stimulus and a malleable organism 
nor one which conceives of a neutral stimu- 
lus and an active organism will be appropri- 
ate. If, as seems reasonable, the characteris- 
tics of stimuli play an active role in the 
behavioral process and, as also seems reason- 
able, the organism has an active role in the 
behavioral process, the characteristics of
both an environment and an organism must
be known, and predictions of the
consequences of their interaction become the
goal of any theory.

TABLE 1
EXPERIMENTAL ARRANGEMENTS FOR CONDITIONS 1 THROUGH 16

Condition  Generator  Sequence      Prediction  History  Number of Ss  Topic
    1          1      1             A or B      8-8      26            Effect of history
    2          1      1             A or B      8-0      8
    3          1      1             A or B      1-1      7
    4          2      1             A or B      8-8      26
    5          2      1             A or B      8-0      7
    6          2      1             A or B      1-1      8
    7          3      1             A or B      8-0      6
    8          2      1             D or S      8-0      6             Effect of response mode
    9          2      [illegible]   D or S      1-0      [illegible]
   10          1      [illegible]   D or S      8-0      [illegible]
   11          1      1             D or S      1-0      7
   12          3      1             D or S      8-0      7
   13          1      2             A or B      8-8      20            Effect of sequence
   14          1      3             A or B      8-8      20
   15          2      3             A or B      8-8      12
   16          3      4             D or S      8-0      7

If people behave in a lawful manner 
and their predictions are also influenced by 
the reinforcing events, it seemed that the 
orderliness of behavior would be very dif- 
ficult to find if a random noncontingent 
sequence were used (since the behavior
would look like the characteristics of the 
sequence). For this reason, the experiments 
described below used partially random 
event-contingent, nonstationary sequences. 
They were not designed to test any theory 
or model. 

Table 1 lists the first 16 experimental
conditions of the present set of studies.
Each use of a given generator will be
referred to as a condition. As can be seen
from the table, a distinction was made
between the generating source and the event
sequence. The generating source specifies
the basic rules by which the binary event
sequence is generated. The binary event
sequence contains derived rules which are
a direct consequence of the operation of
the basic rules and observable in the
sequence itself. Any one generating source
can produce a large number of different
event sequences.

The research reported in this paper
explores the effects on prediction behavior
of the following variables:

1. The characteristics of the generating
source and the resulting sequences. Three
logically related, but different, generating 
sources were used. All Ss in any particular 
group, and in several different groups, were 
given the same sequence. As will be seen 
later, even groups getting different generat- 
ing sources had the same sequence on odd- 
numbered trials. To determine the con- 
straining effect of particular sequences, 
additional groups were run with different 
sequences. 

One common characteristic of all the
generating sources and resulting sequences
was the presence of partial pattern or
order. Certain events were completely
predictable, while others were uncertain.

A distinction between symbol events or
trials and higher-order states or
occurrences is helpful in understanding the
nature of the sequences. The transitional
probabilities from trial to trial were not
stationary, but those from state to state
were, given that “state” is appropriately
defined for any sequence.

One way of viewing the sources is as
simple languages without punctuation or
spacing between words and phrases. The
languages all have two-letter alphabets.
Words in some of the languages are two
letters long, and only some of the
possible two-letter words exist in the
dictionary for that language. In some languages
there are two words of different lengths, in
some the transitional probabilities from
word to word are stationary, and in some
they are not. However, the highest order
language in the present series has phrases
consisting of pairs of words, and the
transitional probabilities from phrase to phrase
are stationary, while no transitional
probabilities below phrase pairs are.

Given the orderliness that issues from the 
characteristics of the generating sources, 
the sequences themselves have two kinds 
of derivative patterning, derived rules 
(completely predictable and learnable con- 
tingencies) and descriptive generalization 
(a generalized sub-sequence resembling 
any actual sequence). Examples of both 
kinds of derivative patterning will be 
given below. 

2. Presence or absence of artificial mem- 
ory or display history. In the two-choice 
experiments where Ss were given no history 
or record of past events in the sequence, 
Ss had to rely on their own memories. Of 
course, if events are random, precise mem- 
ory is not very important. However, there 
are differences between individuals' abili- 
ties to remember events. Also, the experi- 
menters could not predict what the Ss’ per- 
formances might be like on the patterned 
sequences. To reduce variability between 
Ss and enhance the possibility of Ss finding 
the patterning, the most recent eight events 
were displayed to some groups of Ss.²

There were three basic display history 
conditions: Ss received either a display of 
the last eight events and their last eight 
predictions, a display of the last eight 
events only, or a display of the most recent 
event and prediction only. The question 
asked was, “How did history contribute to 
the prediction process?” 

3. The prediction response or mode.
Here Ss indicated whether an A or a B
was going to occur next or whether the next
event was going to be like or different from
the prior event (by pressing buttons
labeled either S or D).

² The notion of using a history display was
supported by Paul Fitts, University of Michigan, a
consultant at System Development Corporation.


GENERATING SOURCES AND 
RESULTING SEQUENCES 

Figure 1 shows the generating source
referred to as Generator 1. On each
odd-numbered trial, the probability of an A
(E₁) was .5. On each even-numbered trial,
an A event always occurred. The overall
probability of an A event on any trial was
.75. Further constraints were placed on
the sequences issuing from Generator 1.
Within specified constraints, all sequences
were obtained by using tables of random
numbers. Sequence 1 was permitted a
variation of no more than 10% over blocks of
40 trials for the odd-numbered events;
overall deviation, therefore, was never
greater than 5% over blocks of 40 trials.
Also, the proportion of As over the total
1,000 trials of the sequence was required
to be exactly .5 for the odd-numbered
trials. Sequence 2 was simply a reversal of
As and Bs for the odd-numbered trials. A
third sequence was constructed for reasons
related to Generator 2 and will be
discussed below. Sequence 4 was a control
sequence.
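
The basic rule of Generator 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `generator_1`, the use of a seeded pseudo-random generator in place of tables of random numbers, and the omission of the block constraints described above are all choices made here, not part of the original procedure.

```python
import random

def generator_1(n_trials, seed=0):
    """Sketch of Generator 1: odd-numbered trials are A with probability .5,
    even-numbered trials are always A. Block constraints are NOT enforced."""
    rng = random.Random(seed)  # assumption: stands in for a random number table
    events = []
    for trial in range(1, n_trials + 1):  # trials are numbered from 1
        if trial % 2 == 1:
            events.append(rng.choice("AB"))
        else:
            events.append("A")
    return "".join(events)

seq = generator_1(1000)
p_a = seq.count("A") / len(seq)  # should lie near the stated overall value of .75
```

Because every even-numbered event is an A, the overall proportion of As is .5 plus half the odd-trial proportion, which is how the value of .75 arises.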

Constraints on sequences apply only to 
discrete blocks of 40 trials. Slightly greater 
than 5% variations were occasionally ob- 
tained if overlapping blocks were ex- 
amined. 
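
The relation between the 10% and 5% limits can be checked with a short Python sketch. It assumes that the 10% figure refers to the proportion of As among the 20 odd-numbered events of a discrete 40-trial block; the function name `block_proportions` and the example block are hypothetical.

```python
def block_proportions(seq, block=40):
    """For each discrete block, return (proportion of A on odd trials,
    proportion of A on all trials). Trial 1 is seq[0], so odd trials
    are seq[0::2] within a block that starts on an odd trial."""
    out = []
    for start in range(0, len(seq) - block + 1, block):
        chunk = seq[start:start + block]
        odd = chunk[0::2]
        out.append((odd.count("A") / len(odd), chunk.count("A") / len(chunk)))
    return out

# Hypothetical extreme block: 12 As and 8 Bs on the 20 odd trials,
# with an A on every even trial as Generator 1 requires.
odd_events = "A" * 12 + "B" * 8
example = "".join(o + "A" for o in odd_events)
odd_p, all_p = block_proportions(example)[0]
```

Under this reading, a 10% departure from .5 on the odd trials (here .6) produces a 5% departure from .75 overall (here .8), since the even trials contribute a fixed half of the block.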

FIG. 1. Generator 1 has all A on even trials.

It is obvious that if S knew the
generating source he would keep track of odd and
even trials and always predict A on even
trials. What he might do on odd trials
would not matter. He could just as well
predict all As and assure himself a
maximum score. In reality, of course, Ss did
not know the generating source. What
could Ss learn about the sequence itself?
For sequences emanating from Generator 1,
they could learn the following two
derivable rules:

1. Never a B following a B (or, always
an A following a B) and

2. Always an odd number of As between 
Bs (or, never a B after an even number 
of As). 

A subtype of Rule 2 (2a) was, “If two 
As, always a third A." 
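
The two derivable rules lend themselves to a mechanical check. The following Python sketch is one way to test whether a string of As and Bs obeys them; the function name `obeys_generator_1_rules` is hypothetical, and runs of As before the first B or after the last B are ignored because their full length is not known.

```python
def obeys_generator_1_rules(seq):
    """Rule 1: never a B following a B.
    Rule 2: always an odd number of As between two Bs."""
    if "BB" in seq:
        return False
    # Pieces strictly between two Bs; the first and last pieces are open-ended.
    interior_runs = seq.split("B")[1:-1]
    return all(len(run) % 2 == 1 for run in interior_runs)

ok = obeys_generator_1_rules("ABABAAAB")       # runs of 1 and 3 As between Bs
bad_even = obeys_generator_1_rules("ABAAB")    # 2 As between Bs
bad_double = obeys_generator_1_rules("ABBA")   # a B following a B
```

Rule 2a is the special case in which the run between Bs has already reached two As, so a third must follow.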

At the descriptive generalization level, 
Ss could come to recognize the reappear- 
ance of certain sub-sequences of events. 
For sequences emanating from Generator 
1, these were: 

1. Strings of As and

2. Strings of BAs, with transitional 
stages which were not clearly either one or 
the other. 

Generator 2 had the following charac- 
teristics: On each odd-numbered trial, the 
probability of an A (E₁) event was .5.
On each even-numbered trial, the event 
was always different from the prior odd 
event. Thus, the overall probability of an 
A occurring on any trial, ignoring trial 
number, was .5. However, the probability 
of an event's being different from the 
event prior to it was .5 on each odd-num- 
bered trial, unity on each even-numbered 
trial, and .75 overall. A score of 75% cor- 
rect can be attained by always predicting 
the opposite event from the one which 
just occurred. Generator 2, then, is logi- 
cally related to Generator 1, as relations 
between pairs of events are to events. The 
same sequence number for Generators 1 
and 2 means that exactly the same se- 
quences of As and Bs were used on odd- 
numbered trials for Generator 2 as for 
Generator 1. 
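
A brief Python sketch may make the 75% figure concrete. It assumes a seeded pseudo-random source and ignores the sequence constraints; the names `generator_2` and `predict_opposite_score` are hypothetical, and scoring begins at the second trial because the first has no prior event.

```python
import random

def generator_2(n_pairs, seed=0):
    """Sketch of Generator 2: each odd event is A or B with probability .5,
    and each even event is the opposite of the odd event before it."""
    rng = random.Random(seed)  # assumption: stands in for a random number table
    events = []
    for _ in range(n_pairs):
        odd = rng.choice("AB")
        even = "B" if odd == "A" else "A"
        events += [odd, even]
    return "".join(events)

def predict_opposite_score(seq):
    """Proportion correct when every prediction is the opposite of the
    event that just occurred (scored from the second trial on)."""
    hits = sum(seq[i] != seq[i - 1] for i in range(1, len(seq)))
    return hits / (len(seq) - 1)

seq = generator_2(500)
score = predict_opposite_score(seq)  # should lie near .75
```

The strategy is correct on every even trial and on about half the odd trials, which gives the overall value of roughly .75.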

The first two sequences were controlled 
for event frequencies only, but no attempt 
was made to control the proportion of 
digrams. Therefore, when the input se- 
quence was plotted in terms of relationship 
for Generator 2, the odd trials (even-odd 
digrams) deviated widely from 50%. For 
this reason, a third sequence was con- 
structed whose characteristics changed 
minimally over time, for both event and 
digram frequencies. Each block of 20 odd
trials contained exactly five digrams of
each kind (i.e., AA, AB, BA, BB). It
followed that there were never more than six
odd As or six odd Bs in a row within
blocks of 20 odd trials. Controlling event
frequency has little effect on digrams; they
may still be randomly distributed.
However, as soon as constraints are imposed on
the digram frequencies, limits are
automatically set on the occurrence of single
events.
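
The link between digram counts and run length can be illustrated in Python. The sketch assumes that a digram is a pair of successive odd-trial events; how the original blocks treated the pair that straddles a block boundary is not stated, so the helper names and the short example string are hypothetical.

```python
from collections import Counter
from itertools import groupby

def odd_digram_counts(odd_events):
    """Count digrams formed by successive odd-trial events."""
    return Counter(a + b for a, b in zip(odd_events, odd_events[1:]))

def longest_run(events, symbol):
    """Length of the longest unbroken run of `symbol`."""
    return max((len(list(g)) for s, g in groupby(events) if s == symbol),
               default=0)

# Hypothetical stretch of odd events containing five AA digrams.
odd_events = "AAAAAABB"
counts = odd_digram_counts(odd_events)
run_a = longest_run(odd_events, "A")
```

A run of k like events contains k - 1 like digrams, so a limit of five AA digrams per block caps the run of odd As at six, as the text states.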

In Generator 2, if S knew the generating 
source, he could keep track of odd and 
even trials and always predict A following 
B and B following A on even trials. What 
he did on odd trials would not matter. He 
could maximize his score by always pre- 
dicting the opposite of what happened on 
each prior trial. Not knowing the generat- 
ing source, Ss were in a similar position 
to those subjected to Generator 1. Two 
derivable rules, exactly comparable to 
those for Generator 1, can be identified: 

1. Never more than two “like” symbols 
in a row and 

2. Always an odd number of single al- 
ternations, or always an opposite pair of 
“like” symbols at either end of a string of 
single alternations (AABABB, never 
BBABABB). 
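
One way to see why these rules are "exactly comparable" to those for Generator 1 is to recode each event as like (S) or different (D) relative to the event before it. The Python sketch below does this for the two example strings in Rule 2; the function names are hypothetical, and the recoding is offered as an interpretation rather than as the authors' own analysis.

```python
def recode_relations(seq):
    """Recode each event, from the second on, as S (same as the prior
    event) or D (different from the prior event)."""
    return "".join("S" if b == a else "D" for a, b in zip(seq, seq[1:]))

def odd_number_of_d_between_s(relations):
    """Generator 1's Rule 2, applied to relations: an odd number of Ds
    must separate any two Ss (which also excludes two Ss in a row)."""
    interior_runs = relations.split("S")[1:-1]
    return all(len(run) % 2 == 1 for run in interior_runs)

allowed = recode_relations("AABABB")     # the permitted example
forbidden = recode_relations("BBABABB")  # the excluded example
```

In relational terms the permitted string has three Ds between its two Ss, while the excluded string has four, so the Generator 1 rules reappear with D in the role of A and S in the role of B.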

At the level of descriptive generalization, 
Ss could identify sub-sequences as: 

1. Strings of single alternations (ABAB
—and BABA—) and 

2. Strings of double alternations (AABB 
— and BBAA—), with transitional stages 
which were not clearly one or the other. 

However, these characteristics refer to 
relations between symbols, and there are 
two forms each can take, making sequences 
from Generator 2 more difficult to handle 
successfully. 

Generator 3 is the logical opposite of 
Generator 2. That is, the even-numbered 
events are always the same as each prior 
odd event. This generator was not part of 
the original design; however, the experi- 
menters formed an interesting hypothesis 
on how Ss perceive and learn sequences 
and incorporated Generator 3 to test it. 
Again, comparable derivable rules can be
identified, and, at the descriptive
generalization level, one can readily specify the
sub-sequences. The sub-sequences for Generator
3 more closely resemble those for Generator
1 than Generator 2, despite the closer
logical relationship between Generators 2
and 3. The hypothesis held that performance
on Generator 3 would therefore be
more like that on Generator 1.
Sub-sequences for Generator 3 are:

1. Strings of As,
2. Strings of Bs, and
3. Double alternations.


Procedure

Subjects. Ss were college students, both
male and female, paid at an hourly rate.
A given S was used in only one
experimental condition.

Apparatus. Each S was seated in a
cubicle enclosed on three sides by curtains.
On a table before each S there was an 8
X 12 inch cathode-ray-tube display and a
[...]. Characters on the display device
(As or Bs) were generated from a 5 X
7 dot matrix. Response recording, timing, and
display presentation were done by a Philco
Transac S-2000 computer, with sequences
input on punched cards and
postexperimental data printed out.

Instructions. Ss were seated in their
cubicles and given a typed set of
instructions to read. The following instructions
were given only to those Ss who had a
display of eight events and predictions. For
other experimental conditions, the
instructions were modified appropriately.


In [...] will be presented a
[...] long sequence of As and Bs, one letter at a
time. Your task is to try to predict what each
letter of this sequence will be before it appears on
the TV [...] procedure we will
follow [...]
the figure. (Ss had illustrations.)
You then punch and enter your next prediction
and wait. Again the second letter of the actual
sequence will appear on the extreme right of the [...]

[...] red button at the top of the panel will light up as
a warning that you have one second left to punch
and enter your prediction. When the red light goes
out and the green light comes on, you can no
longer enter a prediction.

The green light is also activated by the “Enter”
button. This means that you can only change your [...]

[...] what each letter of the sequence will be. At first,
of course, you will have nothing to base your
predictions on, but eventually you will have [...]

We will stop for a short break in the middle of 
the experiment and resume again where we left off. 


Ss were asked whether they understood
the instructions, and any questions were
answered by paraphrasing the instructions.
Ss were then given 1,000 trials, interrupted
in the middle by a 10-minute break,
during which no discussion of the experiment
was permitted.


Results: Conditions 1-16 


To assess Ss’ performance, analyses
were made of both event and digram
frequencies. A digram analysis was made
to examine the relation between prediction
responses and each prior event. The
analysis was done separately for odd-stimulus
events and even-prediction responses and
for even-stimulus events and
odd-prediction responses.

FIG. 2. Comparison of Ss’ predictions with
stimulus events in Condition 1 (Gen.: 1; resp. mode:
A,B; hist.: 8-8).

Figures 2 through 5 show the perform- 
ance of groups under representative experi- 
mental conditions on odd, even, and all 
trials compared with the input sequence 
over blocks of 40 trials. Tables 2 through 
4 give a summary of group performance 
for each condition. 

In general, it can be seen that Ss tend 
toward matching their predictions to the 
characteristics of the sequences with some 
groups attaining better matching than 
others. 
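
The kind of blockwise comparison plotted in Figures 2 through 5 can be sketched in Python. The function name `percent_a_by_block`, the dictionary layout, and the toy strings are hypothetical; the sketch assumes that each block of 40 trials begins on an odd-numbered trial and scores the percentage of As, which is the relevant variable only for Generator 1.

```python
def percent_a_by_block(series, block=40):
    """Percentage of As on odd, even, and all trials in each discrete block.
    Works for an event series or a prediction series alike."""
    rows = []
    for start in range(0, len(series) - block + 1, block):
        chunk = series[start:start + block]
        odd, even = chunk[0::2], chunk[1::2]  # trial 1 of the block is chunk[0]
        rows.append({
            "odd": 100.0 * odd.count("A") / len(odd),
            "even": 100.0 * even.count("A") / len(even),
            "all": 100.0 * chunk.count("A") / len(chunk),
        })
    return rows

# Toy data: events with B on every odd trial, and an S who always predicts A.
events = "BA" * 20
predictions = "AA" * 20
event_rows = percent_a_by_block(events)
prediction_rows = percent_a_by_block(predictions)
```

Matching in the sense used here would appear as prediction percentages that track the event percentages block by block, separately for odd and even trials.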

Analysis of variance, designed to test
four experimental variables, yielded the
results shown in Table 5.

FIG. 3. Comparison of Ss’ predictions with
stimulus events in Condition 3 (Gen.: 1; resp. mode:
A,B; hist.: 1-1).

FIG. 4. Comparison of Ss’ predictions with
stimulus events in Condition 4 (Gen.: 2; resp. mode:
A,B; hist.: 8-8).

Matching to the internal statistical
characteristics of the sequence was closer on
Generator 1 than on Generator 2 (p <
.01), and closer with than without display
history (p < .01). Over the last 400 trials
the differences in sequence for given
generators were unimportant. Response mode
was thoroughly irrelevant.

Performance on Generators. The
difference in performance between Generators 1
and 2 was striking and expected. In
Generator 1, S only had to keep track of odd
and even trials, which meant he had to
count the number of As between Bs. The
derived rules were relatively simple. In
Generator 2, in addition to keeping track
of odd and even trials, S also had to
remember what the previous event had been
when he was on an even trial. The rules
were quite complex. When Generator 3 was
conceived, it had become clear that the
difficulty of the task was not so much in
keeping track of relationships and learning
two parts to each rule but, rather, in
discerning the necessary descriptive
generalizations. When S first sees a sequence in
front of him, he has to organize and code
it to discover its underlying rules. His
initial response, then, is to the easily
identifiable general features of a sequence:
strings of As, Bs, alternations, etc.

FIG. 5. Comparison of Ss’ predictions with
stimulus events in Condition 7 (Gen.: 3; resp. mode:
A,B; hist.: 8-0).

TABLE 2
EFFECT OF HISTORY ON LAST 400 TRIALS

Condition       Generator  Relevant variable              Odd-EVEN     Even-ODD     All
1 (A,B; 8-8)    1          % “A”              Event       100          51           76
                                              Prediction  98           [illegible]  [illegible]
2 (A,B; 8-0)    1          % “A”              Event       100          51           76
                                              Prediction  [illegible]  51           75
3 (A,B; 1-1)    1          % “A”              Event       100          51           76
                                              Prediction  [illegible]  58           [illegible]
4 (A,B; 8-8)    2          % “D”              Event       100          54           77
                                              Prediction  88           55           72
5 (A,B; 8-0)    2          % “D”              Event       100          54           77
                                              Prediction  93           59           76
6 (A,B; 1-1)    2          % “D”              Event       100          54           77
                                              Prediction  77           63           70
7 (A,B; 8-0)    3          % “S”              Event       100          54           77
                                              Prediction  99           45           72

TABLE 3
EFFECT OF RESPONSE MODE ON LAST 400 TRIALS

Condition       Generator  Relevant variable              Odd-EVEN     Even-ODD     All
8 (D,S; 8-0)    2          % “D”              Event       100          54           77
                                              Prediction  93           57           75
9 (D,S; 1-0)    2          % “D”              Event       100          54           77
                                              Prediction  [illegible]  59           67
10 (D,S; 8-0)   1          % “A”              Event       100          51           76
                                              Prediction  97           52           75
11 (D,S; 1-0)   1          % “A”              Event       100          51           76
                                              Prediction  91           61           76
12 (D,S; 8-0)   3          % “S”              Event       100          54           77
                                              Prediction  98           [illegible]  72

TABLE 4
EFFECT OF SEQUENCE ON LAST 400 TRIALS

Condition       Generator  Relevant variable              Odd-EVEN     Even-ODD     All
13 (A,B; 8-8)   1          % “A”              Event       100          49           74
                                              Prediction  97           48           [illegible]
14 (A,B; 8-8)   1          % “A”              Event       100          50           75
                                              Prediction  97           48           72
15 (A,B; 8-8)   2          % “D”              Event       100          50           75
                                              Prediction  92           56           74
16 (D,S; 8-0)   3          % “S”              Event       100          50           75
                                              Prediction  98           51           74

Like Generator 1, Generator 3 had 
easily codable strings of homogeneous 
events. Generator 2 consisted of small, 
constantly changing units of one or two 
like events in a row. 

One phenomenon found in Generator 2
and not in Generators 1 or 3 was a very
high agreement among Ss as to what event
was going to occur on the odd trials. This
was puzzling, since the odd trials came
from a random .5, .5 sequence. The even
trials seem to have had a compelling
organizing effect on the total sequence.
Instead of having strings of As and/or
strings of Bs as main patterns, we had
strings of single and double alternations.
(The intervening transitional stages may
have been viewed as noise by the Ss.)
This, in effect, was the one thing Ss could
learn and whenever they saw those
recurring patterns, they recognized them and
continued predicting single or double
alternations until the pattern changed.
Patterns always changed on odd trials. Though
it was not possible to predict what the
next odd event was going to be, the
consensus was high that once a pattern was
established, it would continue.

In Condition 15, where much agreement
between Ss on odd trials was noticed, those
few Ss who made many mistakes on even
trials tended to disagree on odd trials with
those who made few mistakes on even trials;
however, the latter tended to agree with
one another on their predictions for odd
trials.

TABLE 5
ANALYSIS OF VARIANCE ON DISCRIMINATION BETWEEN ODD AND EVEN TRIALS

Source              SS        df   MS       F
History (H)         .51545    1    .51545   21.695*
Response Mode (R)   .00008    1    .00008   .00336
Generator (G)       .47766    1    .47766   20.10506*
H X R               .000      1    .000     0.000
H X G               .04663    1    .04663   1.96257
R X G               .01648    1    .01648   0.69365
H X R X G           .00937    1    .00937   0.39437
Residual            .023758   48

* p < .01.

Performance with and without Display
History. Since the display history
simulated memory, it was expected that Ss
would learn more easily when it was present.
This was confirmed by the analysis of
variance. It was uncertain, however,
whether the presence or absence of a
prediction history in addition to the input
history would make any difference in
performance. The history of Ss’ predictions adds
[...] as many data to the display, data that
are not really relevant to the task. When Ss’
own responses are presented to them, it tends
to make the task somewhat more difficult.
Table 6 shows that the groups with both
event and prediction histories (8-8) did
not do quite so well on rule learning as the
groups with only the event history (8-0).
It is only for long, unchanging patterns
(strings of seven or more As in Generator
1, for instance), that the prediction history
could be helpful to S; assuming he guessed
occasional Bs on odd trials, he could then
easily keep track of where he was by
looking at his own responses, even though the
Bs of the event sequence had
disappeared from the display. This conjecture
[...] by the data.

When no display history was shown,
display of the predicted event made no
apparent difference. Though display
history was a powerful variable, Figure 6
[...] the best S (in terms of matching
odd and even events) without display
[...] performed quite comparably with
[...] S with display history on Sequence [...]
Response Mode. Regardless of response
mode, the display always consisted of As
and Bs. Translating in terms of relations
apparently was easy enough so that coding
the responses for Ss was without effect.

Rule Learning. We have described the 
two derived rules that Ss could learn in 
order to master the different sequences. A 
single straightforward test can be made for 
5 the learning of Rule 1. Since Rule 2 takes 
| many different forms, а more involved 


_ analysis is required. An hypothesis about 


Prnrorwanxce ох Cowrurx Bpusy Sequexwces о 


TABLE 6 


Percent of Errors by Position on the
Last 400 Trials


Ventes 
J J 5 + 
Conditions
1 AB;&8 GI 1.0 2.6 3.6 5.4 
2 A,B;80 Gl 4 8 9 10.3 
3 A,B;11 G1 1 10.2 9.7 13.9 
4 A,B;88 G2 28 19.0 13.8 18.2 
5 A.B;80 G2 6 10.9 7.2 21.7 
6 A,B;1-1 G2 159 326 248 21.4 
7 A,B;80 G3 A © 1 6.3 
8 DS;80 G2 3.8 7.1 12.0 9.3 
9 DS;10 G2 131 37.1 31.3 20.8 
10 DS;80 Gl м; 1.5 19 12.0 
11 DS;10 G1 10 n4 1760 200 
12 DS;80 G3 5 .0 2.2 10.8 
13 А,В;88 G1 6 1.8 3.7 13.0 
14 А,В; 8-S G1 .9 3.3 1.7 1.3 
15 А,В; 5-8 G2 1.3 160 11.9 16.2 
16 D,8;80 G3 1 6 13 14.4 


the performance on Rule 2 was formulated. 
For Generator 1 the authors proposed that 
as the number of A events in a row in- 
creases, the likelihood of errors also in- 
creases. B events then become markers, or 
anchor points, and the further away S
was from a B, the more difficult it became
for him to keep track of the number of As 
that had occurred. Analogous thinking is 
applicable to Generator 3. However, this 
hypothesis is inapplicable to Generator 2 


Fig. 6. Comparison of two Ss' behavior. S in
Condition 1 had a display history; S in Condition
3 did not.

because of its previously described peculiarities.

To test this hypothesis for Generator 1,
errors were separated by the position in
which they occurred after B events. Position
1 refers to the first A after a B (Rule 1).
Position 3 refers to the third A after a B
and so on (Rule 2). The third position is
the special case, Rule 2a. Positions 2, 4,
etc., fall on odd trials and are not relevant.
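
The tabulation of errors by position can be sketched in code. This is only an illustration under stated assumptions, not the authors' procedure: the function name, the encoding of events and predictions as strings of "A" and "B", and the choice to count every trial since the last B (including the trial on which the next B arrives) are all hypothetical.

```python
from collections import Counter


def error_percent_by_position(events, predictions):
    """Percent of wrong predictions at each position after the most recent B.

    Position 1 is the first trial after a B, position 2 the second, and so on.
    Trials before the first B are ignored (assumption made for this sketch).
    """
    trials, errors = Counter(), Counter()
    since_b = None  # trials elapsed since the last B; None until a B is seen
    for event, predicted in zip(events, predictions):
        if since_b is not None:
            position = since_b + 1
            trials[position] += 1
            if predicted != event:
                errors[position] += 1
        if event == "B":
            since_b = 0
        elif since_b is not None:
            since_b += 1
    return {p: 100.0 * errors[p] / trials[p] for p in sorted(trials)}


# Tiny made-up sequence, only to show the shape of the result.
table = error_percent_by_position("BABAAAB", "AABBABB")
```

With real data one would run this once per condition over the last 400 trials and read off the odd positions, since only those are relevant to Rules 1 and 2.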

Table 6 also shows the percentage of 
errors made at each position for all condi- 
tions. For Generators 1 and 3 the trend 
indicated by the hypothesis seems to be
confirmed. 

The history variable was also powerful 
in determining the percentage of errors made 
in each condition. 

Several investigators (Goodnow, Ruben- 
stein, & Lubin, 1960; Jarvik, 1951; 
Restle, 1961) have emphasized the influ- 
ence of the length of runs of homogeneous 
events on prediction behavior. Restle has
presented a model which assumes both a
random sequence and that Ss perform
solely on the basis of runs of previous
events. Whereas runs of homogeneous
events can and do occur for some of the
sequences in the present studies, the lengths
of runs are not random. To determine the
effect of the embedded pattern on predic-
tions during homogeneous runs we per-
formed an analysis on the data of 16 Ss
from Condition 1 for Trials 601-1,000: For
every run of A events we counted the num-
ber of predictions made that a B event 
would occur. The embedded pattern did 
influence the prediction behavior of these 
Ss during runs of A events. On the first 
three even trials of a run of As, Ss pre- 
dicted less than 5% Bs, while on the first 
three odd trials of a run of As, Ss predicted
almost exactly 50% Bs. After six As in a
run, Ss tended to predict fewer Bs. How- 
ever, despite the fact that only 14% of all 
the predictions were Bs from the seventh 
event on in a long run of As, Ss still con- 
tinued to predict over twice as many Bs 
on odd trials as on even trials. 
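
A rough sketch of this kind of run analysis follows. It is an assumption-laden illustration rather than the authors' tabulation: the function name is hypothetical, the first element of each string is taken to be Trial 1, and a trial is treated as falling "within a run of As" whenever the immediately preceding event was an A.

```python
def b_prediction_rates(events, predictions):
    """Share of B predictions on odd and on even trials during runs of As."""
    counts = {"odd": [0, 0], "even": [0, 0]}  # [B predictions, all predictions]
    for i in range(1, len(events)):
        if events[i - 1] != "A":
            continue  # no run of As is under way
        trial_number = i + 1  # strings are 0-indexed, trials are 1-indexed
        parity = "odd" if trial_number % 2 == 1 else "even"
        counts[parity][1] += 1
        if predictions[i] == "B":
            counts[parity][0] += 1
    return {k: (b / n if n else None) for k, (b, n) in counts.items()}


# Made-up data: S guesses B only on odd trials inside a run of As.
rates = b_prediction_rates("AAAAAAB", "AABABAA")
```

Splitting the same counts further by how far into the run each trial falls would give the early-run versus late-run comparison described in the text.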


WORD GENERATORS 


This section describes six experimental 
conditions in which Ss were faced with a 
more complex language or generating 
source. These languages can be viewed as 
having the following characteristics: a two- 
letter alphabet, A and B; a two-word 
dictionary, AB and AAAB; a syntax, or 
set of rules, for selection of any next word; 
sequential delivery to Ss letter-by-letter 
with no punctuation or differential spacing 
between letters, words, or phrases. The 
primary differences among the six condi- 
tions are in the rules for selecting words 
and the syntactical rules. 

The procedure followed for all six con- 
ditions was identical to that for Conditions 
1, 4, 13, 14, and 15, as can be seen from
Table 7. For Conditions 17, 18, and 19 
there was a direct correspondence with 
some earlier conditions. One word, either 
AB or AAAB, was directly substituted for 
either the letter A or B of Sequence 1 of 
Generator 1 or Sequence 1 of Generator 2. 
Figure 7 shows the substitution for Condi- 
tion 17. Table 8 shows the proportions of 
the letter A and of the word AB in the six 
sequences used in Conditions 17 through 
22. The sequences used in Conditions 17 
and 20 are identical in terms of the pro- 
portion of As and the overall proportion 
of AB words. Conditions 18 and 22 and 19 
and 21 are similarly paired. The sequences 
for Conditions 17, 18, and 19 can be de- 


TABLE 7
EXPERIMENTAL ARRANGEMENTS FOR CONDITIONS 17 THROUGH 22

Condition  Generator  Sequence  Prediction  History  Number of Ss  Topic
17         W1         1         A or B      8-8      5             Word matching
18         W 1I       1         A or B      8-8      5             Word matching
19         W2         1         A or B      8-8      ti            Word matching
20         W 75       5         A or B      8-8      7             Word matching
21         W 50       6         A or B      8-8      6             Word matching
22         W 25       5         A or B      8-8      6             Word matching

Fig. 7. Only AB on even word occurrences
in Generator W1.


scribed as bivariate or nonstationary at 
the word level (the probability of occur- 
rence of a given word is not the same from 
one draw of a word to the next). The se-
quences for Conditions 20, 21, and 22 can 
be described as univariate at the word 
level. That is, the probability that a given 
word will be chosen on the next draw is 
independent of what word has been 
chosen on the last draw and equal to the 
overall probability of occurrence of that 
word over the entire sequence (there is no 
systematic difference between odd and 
even occurrences of words in the latter 
three conditions).
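
The two kinds of word-level source can be sketched as follows. This is a minimal illustration, not the original apparatus: the function names are hypothetical, the odd-draw probability of .50 for the nonstationary source is an assumed value used only for the example, and the even-draw probability of 1.00 follows the Figure 7 caption (only AB on even word occurrences). The second function shows why a given overall proportion of AB words fixes the proportion of the letter A.

```python
import random


def draw_words(n_words, p_ab_odd, p_ab_even, rng):
    """Draw words from the two-word dictionary; p may differ on odd and even draws."""
    words = []
    for draw in range(1, n_words + 1):
        p_ab = p_ab_odd if draw % 2 == 1 else p_ab_even
        words.append("AB" if rng.random() < p_ab else "AAAB")
    return words


def proportion_a_letters(p_ab):
    """Expected share of A letters when AB words make up p_ab of all words."""
    a_letters = p_ab * 1 + (1 - p_ab) * 3    # AB has one A, AAAB has three
    all_letters = p_ab * 2 + (1 - p_ab) * 4  # AB has two letters, AAAB four
    return a_letters / all_letters


rng = random.Random(0)
nonstationary = draw_words(400, 0.5, 1.0, rng)  # syntax depends on odd/even draw
univariate = draw_words(400, 0.75, 0.75, rng)   # same probability on every draw
letters = "".join(nonstationary)                # delivered with no word boundaries
```

Under these assumptions, overall AB proportions of .75, .50, and .25 give expected A-letter proportions of .60, .67, and .714, and a mix of 75% AB words averages 2.5 letters per word, so 400 words amount to about 1,000 letter events.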

A comparison of Figures 1 and 7 shows 
that the logical structure of the generating 
source is identical for Conditions 1 and 
17. (Comparable relations can be worked 
out between Condition 1 and Condition 18 
and between Condition 4 and Condition 
19.) Wherever a letter A was drawn for 
Sequence 1, the word AB was drawn for 
Generator W1; likewise for the letter B 
and the word AAAB. Generator W1, there- 
fore, can be thought of as a language 
with syntactic structure conditional upon 
odd and even draws of words. 


S's task may be considered as a two- 
level task (though he is not told this): At 
the first or most primitive level, the task 
is to identify the two words, and at the 
second, to identify the rules by which 
the words are drawn. 

At the first level there are three derivable 
local rules. Every time any word occurs 
S has the opportunity to learn the rule, 
“Never a BB.” Each time an AAAB word 
occurs S has the opportunity to learn, “If 
two As, then a third A,” and “Never more
than three As in a row.” With these three
rules, S can perform the first-level task. 
Further, the only uncertainty remaining 
about prediction of the next letter occurs 
following a BA. “Reinforcement” comes to 
have two meanings after the lower level 
rules have been learned, one meaning for 
the prediction following the events BA and 
one for all other predictions. In the latter 
case, reinforcement has a maintaining or 
confirming effect. Only in the former case 
does it tend to change or modify behavior. 
The reinforcement of predictions following
BA events comes to reinforce the complex 
response chain of predicting what the next 
word will be. 
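
The three first-level rules can be written out as a small function that returns the letters still possible after a given history. This is a sketch under assumptions: the function name is hypothetical, and the history is assumed to begin at the start of a word.

```python
def possible_next_letters(history):
    """Letters that can follow `history` in a stream of AB and AAAB words."""
    if not history:
        return {"A"}        # every word begins with A
    if history.endswith("B"):
        return {"A"}        # "Never a BB"
    if history.endswith("AAA"):
        return {"B"}        # "Never more than three As in a row"
    if history.endswith("AA"):
        return {"A"}        # "If two As, then a third A"
    return {"A", "B"}       # a single A after a B: the only uncertain point


# Check the rules against a short stream built from the two words.
stream = "AB" + "AAAB" + "AB" + "AAAB"
consistent = all(
    stream[i] in possible_next_letters(stream[:i]) for i in range(len(stream))
)
```

Only the last branch leaves two alternatives, which corresponds to the statement that the remaining uncertainty occurs following a BA.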

At the second, or syntactic, level of S's 
task there are again derivable local rules. 
For Generator W1 these are exactly analo- 
gous to those rules found in Generator 1. 
Restated, these become for W1, “Never 
AAAB following AAAB,” and “Always an 
odd number of AB words in a row.” Learn-
ing these syntactic rules would result in a
tendency toward matching the predictions
of words with the characteristics of their 
sequential occurrence. Analysis of the 
proportion of AB words predicted on odd 


TABLE 8 


CHARACTERISTICS OF SEQUENCES USED IN CONDITIONS 17 THROUGH 22


Proportion “AB” 


Proportion “Different” Word 


die e Odd Even All Odd Even All 
17 .60 .50 1 w 2 
18 ‚714 .50 [ : 
19 .67 .50 .50 by .50 1.00 75 
20 .60 = = n 
21 .67 = = ш 
22 ‚7\4 => = : 
and even word occurrences, therefore,
should show the same tendencies as did 
the analysis of proportion of A predictions 
on Generator 1 sequences. 

Naturally, if Ss learned the rules at both 
levels, they would match letter frequencies 
very closely as a trivial consequence of 
the more complex order-seeking behavior. 

Generator W2 is for word events what
Generator 2 was for letter events.


Results: Conditions 17-22 


Learning of First-Order Rules. Learning 
of the first-order rules was very rapid in 
Condition 17. Learning that there were 
two words and what those two words were, 
was effectively complete by Prediction 
Trial 214. No mistakes were made after 
Prediction Trial 320. These trial numbers 
can be translated as 59 and 85 word oc-
currences, or opportunities to learn first-
order rules. Comparable learning of the
first-order rules occurred in Conditions 18 
through 22. 

Learning of Higher-Order Rules. In 
Condition 17, if Ss did not learn anything 
about the syntax but had learned only the 
word rules, one would expect that the 
word AAAB would be predicted randomly
on 25% of the trials. Consequently, also 

Fig. 8. Comparison of Ss' predictions with stim-
ulus events in Condition 17 (Gen.: W1; resp.
mode: A,B; hist.: 8-8).

Fig. 9. Comparison of Ss’ predictions with stim-
ulus events in Condition 19 (Gen.: W2; resp.
mode: A,B; hist.: 8-8).


by chance, the word AAAB would be ex- 
pected to be predicted following an AAAB 
word 25% of the time. But the derived 
syntax rule is that the word AAAB never 
follows itself. Ss did learn this rule: After 
the first 40 words, AAAB was predicted to 
follow itself less than 5% of the time. The second
syntactic rule, always an odd number of AB
words, was also learned to some extent. The
second AB word in a sequence occurs on 
an odd draw, and Ss continue to predict 
that the word AAAB will occur at about 
the same rate over the entire sequence. 
However, when the third AB in a sequence 
occurs, which is again on an even draw, 
Ss’ predictions that the word will be an 
AAAB decline rather steadily over the 
course of the experiment, reaching 8% over 
the last block of 40 words. For this condition,
Ss had only .4 the number of opportunities
to learn the syntactic characteristics
compared with the Ss of Condition 1: The 
400 words were composed of 1,000 letter 
events. 

Given that Ss were tending to learn the 
syntactic rules, we can now ask whether 
this learning resulted in a tendency to
match the characteristics of the sequence 
for odd and even occurrences of words. 
TABLE 9
ASYMPTOTIC PERFORMANCE ON WORD GENERATORS
(LAST 400 TRIALS OR 120 WORDS)


Condition рожа Proportion “АВ” Proportion "different" word _ 
Odd Even All Odd Even All 

17 Event .60 47 1.00 -73 MERT у 
Prediction 59 .08 .90 .79 

18 Event 71 .50 .00 .25
Prediction 71 43 .09 .20 

19 Event .67 AT .52 .50 39 1.0 .70 
Prediction .66 .51 .50 .51 .59 .86 .72 

20 Event .60 — — ‚75 
Prediction .58 — .88 

21 Event .67 —-. 0 
Prediction .65 = .50 

22 Event л2 — .24 
Prediction 71 == =a -28 


Figures 8 and 9 show word predictions of 
Ss in Conditions 17 and 19, respectively, 
plotted as were the predictions of letters in 
Conditions 1 (for 17) and 4 (for 19). 

Table 9 gives a summary of group per- 
formance for Conditions 17 through 22, 
analogous to Tables 2 through 4 for the 
earlier conditions. First, the close agree- 
ment between predictions and letter events 
should be noted. Second, agreement be- 
tween overall predictions and frequency of 
occurrence of AB words in 17, 18, and 19 
can be seen, though it deviates from 
matching in opposite directions on the odd 
and even words. Possibly Ss compensated 
for errors made on even words by over- 
predicting the more frequent word on odd 
occurrences. 

Comparing Figure 8 with Figure 2 shows 
that Ss in Condition 17 were tending to 
discriminate between odd and even occur- 
rences of words just as Ss in Condition 1 
had for letters. Matching over the last 160 
word events was even closer in Condition
18 (not shown). 

Figure 9 shows that Ss in Condition 19 
(the word equivalent of Generator 2, the 
relationship “different” generator) showed 
some tendency toward discriminating between
odd-even and even-odd word rela-
tions. However, this tendency did not oc- 
cur until the sixth block of 40 words or 
until after the first 200 word occurrences. 
Less accurate matching for word relations 
would be expected, since Ss in Condition 4 
matched less accurately to letter relations 
than Ss in Condition 1 did to letter fre- 
quencies. Even for this difficult task, Ss 
show some evidence of finding the higher 
order characteristics of the sequence. The 
close matching to the overall letter and 
word frequencies for both groups in Con- 
ditions 17 and 18 and to overall word re- 
lations in Condition 19 is an expected 
incidental consequence of the order-learn- 
ing behavior. 

Figure 10 shows a tendency for Ss in the 
three conditions—20, 21, and 22—to 
match rather closely the statistical word 
frequencies of the sequences. Ss in these 
conditions learned the derivable rules at 
the word level as did Ss in Conditions 17, 
18, and 19 and demonstrated very close 
matching to letter events (again as an in- 
cidental consequence of the more complex 
order-learning behavior). Although there 
was nothing to be learned beyond word 
frequencies in these last three conditions, 

Fig. 10. Comparison of Ss’ predictions with
stimulus events in Conditions 20, 21, and 22 (Gen.:
W 75, 50, 25; resp. mode: A,B; hist.: 8-8).


we suspect that Ss were engaged in order- 
seeking behavior anyway. 


Discussion 


In a review of research using binary 
choice situations, Goodnow (1958) states 
some findings about general behavior of 
Ss. Much of the research she cites was con- 
cerned with modifying typical behavior 
in such situations. We are concerned with 
the typical behavior when Ss are not influ- 
enced in particular directions by instruc- 
tions or special payoff rules. 

Given rather vague instructions with 
freedom to interpret them as they wish, Ss, 
according to Goodnow, tend to treat the 
binary choice situation as a problem- 
solving situation. They tend not to regard 
the event series as random. Errors along 
the way are not important if ultimate suc- 
cess is arrived at. 

That Ss have only an imprecise notion 
of the experimental operations but suspect 
the existence of order and believe they are 
learning that order is supported by Levine, 
Leitenberg, and Richter (1964, p. 101). 
After 30 trials of a random event-contin- 
gent sequence, Ss were given 10 reinforce- 
ments regardless of their predictions. 
Levine et al. (1964, p. 120) state: 


One aside is worth recording here. All 120 sub- 
jects received the condition of 10 “rights.” Not one 
indicated any suspicions about the experimenter.
Most, on subsequent questioning, expressed the 
idea that they “caught onto the sequence at the 
end.” 


For most individuals, correct prediction 
of an infrequent event is more of a success 
than that of a frequent, easier-to-get al- 
ternative. For this reason optimizing in 
terms of predicting the more frequent event 
is highly unlikely. 
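
The difference between optimizing and matching can be made concrete with a short calculation. It is offered only as an illustration and rests on an assumption the text does not make for its own sequences: events are independent from trial to trial, and S's predictions are independent of the events. The function name and the value .75 are hypothetical.

```python
def expected_accuracy(p_frequent, p_predict_frequent):
    """Expected share of correct predictions for an independent binary source.

    p_frequent: probability of the more frequent event.
    p_predict_frequent: probability that S predicts the more frequent event.
    """
    hit_frequent = p_frequent * p_predict_frequent
    hit_infrequent = (1 - p_frequent) * (1 - p_predict_frequent)
    return hit_frequent + hit_infrequent


optimizing = expected_accuracy(0.75, 1.0)   # always predict the frequent event
matching = expected_accuracy(0.75, 0.75)    # predict it as often as it occurs
```

Under these assumptions optimizing yields 75% correct and matching 62.5%, so an S who values the occasional correct prediction of the infrequent event gives up some overall accuracy to obtain it.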

Finally, Ss tend to test hypotheses di- 
rectly, or by making use of their prediction 
responses. 

From our observations, and from ques- 
tioning Ss after the experiments, we have 
found support for those statements. What 
are the ultimate consequences of such 
attitudes and behaviors with respect to the 
present series of experiments? Has any 
understanding been gained about the be- 
havioral processes which operate in such 
situations? 

The experiments reported above have 
used event-contingent, partly random, non- 
stationary sequences of binary events. 
The generating sources were more con- 
strained than a univariate independent 
source, but less so than a completely ordered 
sequence. 

Though the univariate generating source 
is very simple to describe mathematically, 
the sequence issuing from such a source, 
because it has minimum constraints, may 
be maximally complex. An infinitely long 
sequence will resemble the statistical char- 
acteristics of the source. Any finite sub- 
sequence, however, is likely to resemble a
source with quite different gross character- 
istics, or its own source with additional 
internal constraints. 
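
How often a short sub-sequence misrepresents its source can be computed exactly for an independent binary source. The sketch below is illustrative only; the function name, the window of 10 trials, and the band of 4 to 6 As are assumptions chosen for the example.

```python
from math import comb


def prob_count_outside(n, p, low, high):
    """Probability that the number of As in n independent trials is < low or > high."""
    total = 0.0
    for k in range(n + 1):
        if k < low or k > high:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


# A 50:50 source viewed through a window of 10 trials.
lopsided = prob_count_outside(10, 0.5, 4, 6)
```

For this example the result is 0.34375: roughly one window in three shows 3 or fewer or 7 or more As, even though the source itself is perfectly balanced.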

Experimenters have rarely, if ever, asked 
from what generating source various product
sub-sequences might have issued. Yet,
in some sense, this is the task which has
been levied upon Ss in experiments using
such sequences. 

Even when our sources generate partly 
ordered sequences, the resulting sub-se- 
quences resemble a family of possible 
sources with more or less constraint. The 


possible interactions occur with their
chance frequencies and within any imposed 
constraints. The resulting sequence, then, 
is a product of the generating source 
rules and the chance interactions. 

If severe enough constraints (event- 
contingent rules) are placed on the gener- 
ating source, then a completely ordered 
sequence results. 

The decision to optimize or seek order in 
a sequence is not simple. Even if the tendency
to optimize were present, it would be
necessary to decide whether and when and 
how to optimize. This decision depends 
in part on the kind and amount of order 
in the sequence, how much has been found, 
and how difficult it is to find. 

Even if S had learned as much as it were 
possible to learn about sequences issuing 
from Generators 1, 2, or 3, he could not be 
certain that there might not be more un- 
discovered order. Put another way, any 
additional, unfound patterning might lead 
to a score better than that to be obtained 
by simple optimization. Balanced against 
the possibility of discovering further order 
is the cost of searching for it. 

With respect to the more complexly 
ordered word generators, the authors ini- 
tially surmised that Ss would show a learn- 
ing progression from details to more gen- 
eral aspects of the sequence. This was 
expected particularly in the word sequences 
in a progression from learning to identify 
words to learning the syntax. Though such 
a trend was indicated, learning seemed to 
overlap at both levels, with higher order 
rule learning underway before lower order 
rule learning was complete. The manifesta- 
tions of the various rules are interleaved in 
the event sequence. Naturally, since syntax 
characteristics are made from combina- 
tions of word events, opportunities to 
learn syntax rules will generally occur less 
frequently than will opportunities to learn 
word-construction rules. Nevertheless, syn- 
tax rule learning does begin before word- 
rule learning is complete, and some Ss 
learn a syntax rule before all word rules 
are mastered. Ss appear to proceed in the 
learning process from the general to the 
specific rather than by the building block
approach. Garner (1962, p. 305) hypothe-
sizes that, "Perhaps... learning of sequen- 
tial characteristics is much easier when 
subjects can see relatively long sequences 
as a space pattern rather than as a time 
pattern.” This suggests that artificial 
memory, in the form of a display history, 
may have a crucial effect on the order in 
which the various rules are learned. No 
groups have been run on the word gener- 
ators without a display history, so Garner's 
hypothesis is yet to be tested in this sense. 

Accepting the interfering effect of the 
requirement to respond found by Galanter 
and Smith (1958, pp. 361-362) and by 
Bruner, Wallach, and Galanter (1959), 
there is reason to believe that there is yet 
another kind of response interference oper- 
ating. Assume that S is looking for order 
in a sequence of events. Given that he 
knows, or thinks that he knows, some of 
the characteristics of the sequence at any 
moment in time, it seems reasonable that 
his next prediction could be based on an 
optimization. This would especially be ex- 
pected if prediction responses were rela- 
tively independent of the events about 
which he is predicting. However, the pre- 
diction responses may serve another pur- 
pose, viz., to test (and verify or deny) 
some hypothesis S has at the time. It 
seems to be very difficult or impossible 
for S simultaneously to work at finding 
additional order and to optimize his cur- 
rent predictions on the basis of what he has 
already found out about the sequence. 

As indicated in the introduction, in try- 
ing to understand the behavioral process, 
the authors formulated three alternative 
hypotheses which are, unfortunately, not 
mutually exclusive. 

The first hypothesis was that Ss were 
attempting to infer the characteristics of 
the generating source from the sequential 
data presented to them. If this were so, 
the attempt failed: Ss were unable to de- 
scribe the basic source characteristics. 
They were not instructed to do so, but 
neither were they instructed to seek order. 
One still wonders, on the basis of prior 
findings (Goodnow, 1955; Goodnow & 
Postman, 1955), whether they could come 
learning. On the other
hand, Ss’ verbal descriptions of what they
had learned were at a different level from 


havior and the tendency to reproduce sam- 
ples of sequences when Ss were asked to 
describe what they had learned supported
this hypothesis. We came to think of this 
phenomenon as analogous to descriptive 
generalization, or the describing of a sam- 
ple which is representative of all samples 
and violates the characteristics of none. Of 
course, the rules or characteristics dis- 
cussed under Hypothesis 2 would be con- 
tained in the molar descriptive sample. Ss 
were very good at reproducing a sample 
sequence which could have come from the 
sequence they had had. They could, in 
other words, simulate the generating source 
rather well. Ss first seemed to have learned 
general or molar (and at first inexact) 
characteristics of the sequence and then 
proceeded to learn the details in the form 
of the derivable local rules. 

As each rule was learned and some re- 
sponse uncertainty was reduced, Ss con- 
tinued to work at reducing their remain- 
ing uncertainty. Interestingly, once a rule 
had been learned, reinforcement simply 
confirmed that things had not changed. As 
Bower and Trabasso (1963) have stated, 
S can only learn additionally from being 
wrong or from negative reinforcements. 


Initial models of the stimulus-sampling 
type (Estes, 1950) assumed that each stim- 
ulus event, each prediction response, and
each trial, respectively, were independent.
There would be, if such were the case, a
highly desirable neatness and simplicity to 
the behavioral process. Discussing the pat- 
tern model, Estes (1959, p. 12) states: 


But in some cases predictions from component 
models fail and it appears that a simple account 
of the course of learning requires the assumption 
that responses become associated, not with sepa- 
rate components or aspects, but with total pat- 
terns of stimulation considered as units. 


Pattern models, then, allow for total pat- 
terns of stimulation to be considered as 
units. However, responses are still defined 
as independent of one another and contin- 
gent upon a stimulus pattern present when 
each response occurs. 

Restle (1961, p. 109), discussing the 
binary choice experiment, presents a dis- 
cussion which so exactly states the opin- 
ions of the present investigators that it is 
reproduced below. 


However, the subjects’ comments about their
attempts to solve such problems suggest an entirely
different picture. The subject is confused in
a way which accords with the stimulus-sampling
theory, but whatever ideas he can formulate consist
mainly of “the events seemed to come in a
sequence like bbybyy,” or “I changed because
there had been too many b's in a row,” or “I
usually shift when I am wrong and stay when I
am right,” etc. The subject seems to think that he
is responding to patterns. Such attempts are natural.
The subject has no way of knowing that the
events occur at random, and even if he is told
that the sequence is random he does not understand
this information clearly, nor is there any
strong reason for him to believe it. Psychological
experimenters do not have a reputation for veracity.
If the sequence is random, and the subject
knows it, he still cannot predict accurately, and
consistent accuracy is the natural aim in a simple
experiment like this. But if the subject can identify
the pattern, he will solve the problem. It will
not be easy to convince the subject that there is
no pattern in the sequence, since he has a great
many hypotheses and, with no facilities for keeping
records, is quite unable to eliminate any but
the simplest few decisively.


The present series of experiments yield
strong supporting evidence that Restle's
inferences about Ss are indeed the case.
When there is a pattern to be identified,
they tend to find it.

The concept of S's going from the unconditioned
state to the conditioned state
and remaining there (and especially in an
all-or-none fashion) is not supported by
the behavior of Ss in the present set of experiments.
There is perhaps too strong a
(among psychologists) that the performance
of a token response is a perfect
and valid measure of what has been
learned; that if a particular response is
the experimenter can determine
whether learning has or has not occurred
solely on that basis; that variation in attention
won't matter; and that confusions
won't occur. In a complex situation, learning
may be gradual and may proceed from
a vague and uncertain state to more precise
and confident ones.


CONCLUSIONS 


The research described above has filled
some of the gap between the random, univariate
generating source and the completely
ordered or patterned sequence. In
some sense, the generating sources used in
the present studies could be considered a
mixture of the other two kinds of sources.
Ss in the present studies demonstrated on
the whole more orderly behavior than do
Ss to random univariate sequences.

It is important to state what the Ss did
not do. They did not work at inferring
what the generating sources had been.
They did not attempt to analyze the sequence
formally in order to learn the derivable
local rules.

At the most general level the Ss seemed 
to do something which was imprecise, sim- 
ple, and rather elegant. Their general be- 
havior can be accounted for if we describe
the process as learning to discern the recurrent
patterns of events, to predict that
once a recurring pattern has begun it is 
likely to continue, and to shift to a differ- 
ent pattern when such a shift is indicated. 
It is apparent that the behavior observed 
in this research was not simple. Three 
avenues of further research suggest themselves.

First, the effects of a wider variety of 
complex problem. An example is the
of certain aspects of language learning which 


ing. If a word 
letter when it follows a word y, but 
when it follows a word z, what is 
process by which this rule is learned? 
Third, there were certainly great indi- 


between events. Clearly, the reasons for these 
differences between people are not under- 
stood, and further investigation is war- 
ranted. 

What implications about mathematical 
models of binary choice behavior can be 
drawn from this research? Current models 
do not reflect the interesting characteristics 
of the behavioral process of people con- 
tending with the sequences used in the 
present series of experiments. This, how- 
ever, does not say that models cannot de-
scribe the process well, but models, to be
adequate, must consider the nature of the 


generating source against which the be- 
havioral process is being studied. 
SEMANTIC DIFFERENTIAL PROFILES FOR 1,000
MOST FREQUENT ENGLISH WORDS


David R. Heise
University of Chicago 


Semantic differential (SD) factor scores on the Evaluation, Activity, and 
Potency dimensions are presented for 1,000 most frequently used English 
words. Also given are the standard errors of the factor scores, the results of 
several reliability studies, and a listing (for all words) of 3 types of derived 
scores: polarizations, n Affiliation contents, n Achievement contents. Test- 
ing procedures and statistics on the sample of raters are detailed. Some uses 
of the dictionary are suggested, and an example of its use in a study of 
motivation is presented including empirical results. Conditions favoring 
further cumulation of SD data are discussed. 


The semantic differential (SD) has
proven to be an accurate instrument 
for recording affective associations of stim- 
uli, particularly to the extent that such as- 
sociations are culturally or subculturally 
defined so that measurements may be aver- 
aged over groups of individuals (Norman, 
1959). In a wide variety of studies, includ- 
ing many involving cross-cultural samples 
of raters, it has been demonstrated that 
affective judgments on bipolar adjective 
scales reliably resolve into three major 
dimensions or factors which Osgood has 
named Evaluation, Activity, and Potency 
(Osgood, 1962; Osgood, Suci, & Tannen-
baum, 1957). Meaningful differences among
words, sounds, colors, pictures, facial ex- 
pressions, and a wide variety of concepts 
have been found using measurements on 
these dimensions. 

The principles of SD methodology may 
be summarized as follows: 

1. Ratings on bipolar adjective scales— 
whatever the number and variety of scales 
used—are largely a function of a few di- 
mensions of judgment. 

2. These dimensions or factors are mean- 
ingfully related to affect. 

3. A few appropriate scales can be used 
to obtain reliable measurements on any one 
dimension. 

4. Measurements made on a given dimen- 
sion are comparable for stimuli of greatly 
different character (words, colors, sounds, 
etc.).

The instrument’s usefulness has been 
recognized generally, and already applica- 
tions are too extensive and varied for re- 
view here. However, the present eclectic 
use of SD methodology as “a research tool”
does not seem to exploit its potentialities 
fully. 

Unlike most present research instru- 
ments in the social sciences, the SD is 
amenable to standardized application in
studies of personality, culture, and society.
Using the SD, a systematic body of data
can be assembled on the affective associa- 
tions of sociocultural elements in different 


groups. The existence and availability of 
such a collection of data would be valuable 
in facilitating new research, in stimulating 
theoretical developments, and as a hand- 
book with practical uses. 

Further, such materials can be assembled 
without special projects or great expense if 
investigators using the SD extend their 
individual efforts only slightly. The present 
study provides an initial fund of data and 
also serves as an illustration of the thesis 
of feasibility. 

Immediately following is a brief discus- 
sion of the original research out of which 
the present work grew. Next follows a 
description of the procedures and analyses 
involved in assembling the dictionary of 
semantic profiles given here. In the third
section, a return is made briefly to the 
research program outlined in order to illus- 
trate how such materials can be applied; 
other uses of the dictionary also are indi- 
cated. In the fourth section, some problems 
of accumulating data are discussed, focus- 
ing particularly on SD data. The dictionary 
of semantic profiles is presented as an ap- 
pendix. 


OPERATIONAL CONSIDERATIONS 


The dictionary presented here was as- 
sembled to facilitate another research 
study. This focal study partly determined 
the form of the dictionary, and thus a 
brief description is pertinent. 

On the basis of theoretical considera- 
tions, it was hypothesized that persons 
aroused in a given motivation (n Achieve-
ment, n Affiliation, n Power, etc.) will use
words whose affective connotations—as 
measured on the SD—are congruent with 
the given motivation. This hypothesis was 
operationalized roughly according to the 
following paradigm. (a) Arouse a person in 
a given motivation. (b) Take a sample of
the words he emits while in this state. (c) 
Determine the affective connotations of 
these words in terms of the SD. (d) Score 
the word profiles for the motivation being 
considered. (e) See if the average motiva- 
tion score for the subject’s words is high as 
compared to the average score for words 
from an unaroused subject. 

In order to test the hypothesis, a dic-
tionary was necessary in which the SD
profiles for emitted words could be “looked
up.” One such dictionary existed in pub-
lished form: “An Atlas of Semantic Dif-
ferential Profiles for 360 Words” (Jenkins,
Russell & Suci, 1958). That publication 
was a stimulating factor in developing 
the research program described here. How- 
ever, words were included in the atlas on 
the basis of their psychological interest 
rather than on the basis of frequency of 
usage, and this fact, plus the relatively 
small size of the atlas, limited its adapta-
tion. Hence, it was necessary to assemble
the dictionary presented here. 


Selection of Words 


Considering the expense and effort neces- 
sary to acquire reliable SD profiles, it was 
infeasible that the required dictionary 
should include every English word. Nor 
was it necessary. In research focusing on 
subjects’ verbal behavior, one need not con- 
sider all words emitted but only repre- 
sentative samples. Further, according to 
the number-frequency phenomenon docu- 
mented by Zipf (1949) among speakers of 
any language, a few words occur very 
frequently and constitute a large proportion 
of the total number of verbal emissions. 

It was decided that a dictionary of 1,000 
words was both economically feasible, 
given frequency as the criterion for select- 
ing words, and adequate for research pur- 
poses. (The choice of the exact, rounded 
figure was arbitrary, of course.) 

Pilot studies indicated that the criterion 
of frequency could be profitably modified 
in two ways: 

1. By excluding “function words” from 
the dictionary. 

2. By treating meanings rather than 
words as the basic units. 

Some very frequent words (e.g., the, and, 
he, is, to) are function words, i.e., their 
emission in verbal behavior is determined 
mainly by grammatical requirements (Mil- 
ler, 1954). Function words are of little 
interest in SD work because their SD pro-
files all tend to be neutral; this was evi-
dent from pilot work, but an ex post facto
analysis provides some quantitative evi-
dence.

In pilot work, 22 function words were
rated. The difference between the mean rat-
ing and the neutral point on each of the
eight rating scales was determined for each 
of these words. A rough estimate of the 
rating variance for individual SD scales
was obtained by randomly selecting 30 
words from the 1,000 dictionary words of 
the present study; the variances of the 
ratings on each scale for all 30 words were 
pooled to get an overall estimate of rating 
variance. (s² = 2.76. Scales and subjects
were not identical to those used in pilot 
work, but the variability statistics for the 
scales in the two studies may be taken as 
approximately comparable.) From this esti- 
mate of scaling variance, the standard 
error of the mean scores in the pilot work 
was estimated (s_x̄ = .83; four subjects
rated each word in the pilot study), and a 
table of normal probability was consulted 
to see how great a difference between the 
mean score for a word and the neutral 
point of a scale was necessary for signifi- 
cance at the .05 level in a two-tailed test. 
A significant difference was found to be 
1.66. Using this value as a criterion it was 
found that only 4% of the mean SD rat- 
ings for function words were significantly 
different from neutrality. The procedure 
was repeated for 22 randomly chosen con- 
tent words from the pilot study: 44% of 
the mean SD ratings for these words were 
nonneutral. 
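
The criterion used in this paragraph can be retraced with a few lines of arithmetic. The sketch below assumes that the standard error is the square root of the pooled variance divided by the four pilot raters, and that the criterion of 1.66 corresponds to two standard errors; the multiplier of 2, the neutral point of 4.0, and the helper name `is_nonneutral` are illustrative assumptions, not details given in the text.

```python
import math

# Figures quoted in the text.
pooled_variance = 2.76   # pooled rating variance per scale (s squared)
raters_per_word = 4      # pilot-study raters per word

# Standard error of a mean based on four ratings.
standard_error = math.sqrt(pooled_variance / raters_per_word)

# Assumption: the reported criterion of 1.66 is twice the rounded
# standard error (a rounded stand-in for the normal deviate 1.96).
criterion = 2 * round(standard_error, 2)


def is_nonneutral(mean_rating, neutral_point=4.0, cutoff=criterion):
    """True if a mean rating departs from the neutral point by at least the cutoff.

    The neutral point of 4.0 assumes a 7-point scale scored 1 to 7.
    """
    return abs(mean_rating - neutral_point) >= cutoff


print(round(standard_error, 2), round(criterion, 2))
```

Applied to every scale mean of the 22 function words and of the 22 content words, a check of this kind would yield the proportions of nonneutral ratings compared in the text.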

Function words were deleted from con-
sideration in compiling the dictionary. A
function word was defined operationally 
as any word which serves as an article, 
preposition, conjunction, pronoun, or verb 
auxiliary. Words which serve both as a 
function word and as a content word (e.g., 
the word “around” can be a preposition or
an adverb) were retained for consideration,
and the meaning associated with usage as a
content word was included in the dictionary
if its frequency was appropriately high.

In studies using the SD, a distinction 
between words and concepts is crucial be-
cause of the frequent case in which a sin-
gle word designates several different con-
cepts. This problem can be dealt with
simply by defining the words used, both 
when presented as stimuli and when listed 
in dictionaries. In this way, the goal of 
having subjects rate a specific concept is 
more nearly achieved, and one knows 
definitely, when using the dictionary of SD 
profiles, whether a given concept is or is 
not represented. Concepts (a word plus its 
definition) were used as the units of analy- 
sis in this study. 


Dimensions To Be Measured 


Three factors, Evaluation, Potency, and 
Activity, typically account for the major 
portion of the common variance among SD 
scales. The repeated extraction of these
factors with a wide variety of concepts 
and subjects from various cultures indi- 
cated that SD profiles certainly should in- 
clude measurements on at least these three 
dimensions. Quite often a fourth factor 
(which Osgood has named Stability) can 
be extracted. This factor accounts for less 
variance than do the first three factors, 
but it seemed to have potential relevance 
for the primary study being conducted, so
scales were included to measure this factor 
also. Though measurements on still other 
factors might have been of interest, the 
additional factors typically account for so
little variance in factor analytic studies 
that it seemed uneconomical to treat them. 


PROCEDURE 


Preliminary Work 


Semantic Differential Instrument. Three con- 
siderations determined the general form of the SD 
instrument used in the study. 

1. It was decided in advance that 1,050 concepts 
would be scaled (50 concepts were to be scaled 
twice for information on reliability and other mat- 
ters), and also it was predetermined that about 340 
subjects would be available for 1 hour of rating 
time. Using the estimating formulas presented by 
Osgood, Suci, and Tannenbaum (1957, pp. 80-81), 
it was calculated that a total of 136,000 scaling 
judgments could be made, or (at most) 129 judg- 
ments per concept. 

2. To obtain adequate reliability, a sample of 15 
different raters per word was deemed minimal. 

3. To calculate factor scores relatively free of 
contamination by the unique variances of scales, 
at least two scales for each of the four factors to 
be measured were necessary. 


Given these conditions, the instrument for the 
study was determined as having eight scales; the 
conditions allowed either 15 or 16 subjects to be 
used per concept; naturally the larger number was 
chosen. 
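
The way the three considerations fix the design can be checked with simple arithmetic. The sketch below uses only figures given in the text; the variable names are illustrative, and the rate of judgments per subject is derived here rather than taken from the Osgood, Suci, and Tannenbaum formulas, which are not reproduced in this section.

```python
# Figures quoted in the text.
subjects = 340
total_judgments = 136_000
concepts = 1_050          # 1,000 concepts plus 50 scaled twice
factors = 4
min_scales_per_factor = 2
min_raters = 15

# Implied number of judgments one subject makes in the hour.
judgments_per_subject = total_judgments // subjects

# Upper bound on judgments available for each concept.
judgments_per_concept = total_judgments // concepts

# Fewest scales that still give two scales per factor.
scales = factors * min_scales_per_factor

# Largest number of raters per concept that fits the budget.
max_raters = judgments_per_concept // scales

print(judgments_per_subject, judgments_per_concept, scales, max_raters)
```

With eight scales the budget allows at most 16 raters per concept, which is also the number adopted in the study; 15 raters would have satisfied the minimum but left part of the budget unused.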

Two small-scale pilot studies were run as an 
aid in choosing the actual scales to be used. On 
the basis of correlations demonstrated in these 
studies and considering the factor loadings pre- 
sented for scales in published works, the following 
scales were chosen: 


Dimension     Scale

Evaluation    Good-Bad
              Pleasant-Unpleasant

Activity      Active-Passive
              Lively-Still

Potency       Strong-Weak
              Tough-Tender

Stability     Rational-Emotional
              Tamed-Untamed


List of Concepts. West’s (1953) frequency count 
of semantic units in English was used as reference 
in compiling the list of concepts. For the task, the 
West book was the most adequate source even 
though it contains known flaws (see Rosenzweig 
& McNeill, 1962). Percentages were converted to
frequencies for the present work. 

Each word concept with a frequency of at least 
300 usages per 5 million word occurrences was 
entered into a preliminary file resulting in a list 
of 1,047 units. By raising the critical frequency to 
337/5-million, the list was reduced by 97 units to 
950 word concepts. Then 39 more units were 
dropped on a subjective basis (the deleted words 
are listed in Appendix A); these were concepts 
which, though listed as having frequencies greater 
than 336/5-million, seemed unlikely to occur in 
the context of brief, extemporaneous stories. The 
final list drawn from the published semantic count 
then amounted to 911 concepts. 

It was anticipated that some word concepts
might appear in extemporaneous stories more fre-
quently than they do in the formal kinds of writ- 
ings on which published frequency counts are 
based. In order to adjust the list for this possibil- 
ity, a short frequency count (covering about 8,000 
word occurrences) was made of words appearing in 
a published collection of stories (Atkinson, 1958, 
Appendix 1). All concepts which were used at least 
twice (once in two different stories written to 
different picture stimuli) were included in the 
final list. This added 85 units to the 911 units al-
ready compiled. Of these 85 additional units, how-
ever, 40 were among those dropped from the
original file for having frequencies less than 337/ 
5-million though greater than or equal to 300/5- 
million. Finally, four additional words (admiral, 
enlist, navy, sailor) were included, which, it was 
believed, might have frequent use among Navy 
enlistees (the subjects to be used in the experi-
ment). The final list then contained exactly 1,000
semantic units, and in the list were 778 different
words.
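
The successive additions and deletions reported above can be tallied as a check that they arrive at the stated total. This is only a bookkeeping sketch; the variable names are illustrative.

```python
# Counts reported in the text for each step of list construction.
preliminary_file = 1_047      # units at 300 or more usages per 5 million
dropped_by_cutoff = 97        # removed by raising the cutoff to 337
dropped_subjectively = 39     # judged unlikely to occur in brief stories
added_from_stories = 85       # from the count of published stories
added_navy_words = 4          # admiral, enlist, navy, sailor

after_cutoff = preliminary_file - dropped_by_cutoff
from_semantic_count = after_cutoff - dropped_subjectively
final_list = from_semantic_count + added_from_stories + added_navy_words

print(after_cutoff, from_semantic_count, final_list)
```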

Definition of Concepts. Pilot work had indi- 
cated that definition of word concepts could not 
be achieved through the use of synonyms since 
the mere presence of other content words contami- 
nates the affective connotation of a stimulus word. 
Pilot work also had indicated, however, that pres- 
ence of function words has relatively little effect 
on the affective connotation of a stimulus. Thus 
it was feasible to define each word concept by 
giving an example of its use in a sentence com- 
posed otherwise of function words only. A 67-word 
vocabulary of function words was used in con- 
structing defining sentences (see Appendix B); 
these words alone sufficed to define 90.9 percent 
of the 1,000 semantic units on the list (for 91 
entries, use of a nonfunction word was required 
to make the concept’s meaning clear). A few words 
on the list do not have multiple meanings: these 
words, of course, did not require defining sentences, 
but sentences were provided as a control measure. 

Verbs were defined by sentences in which the 
verb was used in the simple past; nouns were de- 
fined by sentences in which the noun was used in 
the singular (except in a few cases where this 
seemed awkward and opposed to common usage). 
The length of each sentence was restricted by the 
requirement that the word plus its defining 
sentence, including punctuation and blanks, could 
not exceed 36 spaces in length—this restriction was 
necessary since words and defining sentences were 
to be keypunched into tabulating cards. 

Preparation of Stimulus Cards. Mark-sense tech- 
niques were used in data collection, thus eliminat- 
ing steps of coding, transcribing, and manual key- 
punching of the SD data gathered. Responses were 
recorded by subjects with an electrolytic pencil
on mark-sense tabulating cards prepared especially 
for the study. These cards presented both the 
concept to be rated and the set of SD scales. 

In the first 36 columns of a card the word con-
cept and its defining sentence were keypunched,
these punches being interpreted at the top of the 
card to provide the printed stimulus. All eight SD 
scales were preprinted toward the middle of the 
cards; each pair of adjectives appeared on a sepa- 
rate row; and adjectives were separated by seven 
mark-sense positions, thus defining the standard 
7-point scale. 

The following steps were followed in preparing 
packets of cards for the subjects. First, the 1,000- 
unit list of concepts and defining sentences (and 
serialization numbers for alphabetic sequence) was 
keypunched to form a 1,000-card master deck. This 
master deck was reproduced and interpreted on
the mark-sense cards 16 times (since 16 subjects 
were to scale each concept). Packets of 50 cards 
were then sorted out by machine such that each 
packet contained every twentieth word in the 
master deck (different packets began with different 
serialization numbers, e.g., 001, 002, 003, etc., up to
020). This procedure of taking every twentieth
word from a list in alphabetical order seemed a
satisfactory substitute for the more difficult tech-
nique of drawing cards randomly in order to make
up packets. To prevent the order of cards within
packets from having systematic effect on ratings, 
the cards in each of the final 320 packets were 
shuffled by sorting on various alphabetic columns. 
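
The packet scheme described above is a systematic sample: every twentieth card, starting from each of twenty offsets, repeated for each of the 16 copies of the master deck. A minimal sketch follows. The function name `build_packets` and the use of a seeded random shuffle are assumptions for illustration; the study shuffled cards by machine sorting on alphabetic columns.

```python
import random
from collections import Counter


def build_packets(n_concepts=1000, step=20, copies=16, seed=0):
    """Return packets of serial numbers, one packet per starting offset and copy."""
    rng = random.Random(seed)
    serials = list(range(1, n_concepts + 1))  # alphabetic serialization numbers
    packets = []
    for _ in range(copies):
        for start in range(step):
            # Every twentieth serial number beginning at 001, 002, ..., 020.
            packet = serials[start::step]
            # Stand-in for the machine shuffle used in the study.
            rng.shuffle(packet)
            packets.append(packet)
    return packets


packets = build_packets()
ratings_per_concept = Counter(serial for packet in packets for serial in packet)
print(len(packets), len(packets[0]), set(ratings_per_concept.values()))
```

Under these assumptions the scheme yields 320 packets of 50 cards, and each of the 1,000 concepts appears in exactly 16 packets.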


Fieldwork 


Subjects and Sampling. Subjects who served as 
raters were Navy enlistees enrolled in a 16-week
training program at the Hospital Corps School,
Great Lakes Naval Training Center, Great Lakes, 
Illinois. The population's average IQ is 110.5 (as
estimated by summing scores on the Navy's Gen- 
eral Classification Test and Arithmetic Test), edu- 
cation averages 11.9 years, and the average age is 
18.9 years. These population statistics are based on 
2,621 cases distributed over about 3 years. Addi- 
tional statistics on the sample of raters are pre- 
sented in Table 1. 

Participation in the project was mandatory for
all trainees in the school at the time (except
those with conflicting official duties during testing
periods).


TABLE 1
DESCRIPTIVE STATISTICS ON SAMPLE OF RATERS
(N = 342)

Demographic variable          %     Demographic variable          %

Age in years                        Father's occupation
  17                          2       Farmer or farm worker       2
  18                         35       Unskilled                   4
  19                         32       Service                     6
  20                         11       Semiskilled                 1
  21                          9       Skilled                    34
  22                          4       Clerical                    5
  23                          3       Sales                       5
  24                          1       Proprietor or manager      12
  25, 25+                     1       Professional               14
  No answerᵃ                  3       No answerᵃ                  6

Father's education                  Family income in dollarsᵇ
  No schooling                0       Less than 3000            [?]
  Less than 5th grade         2       3000-4999                 [?]
  5th to 8th grade           18       5000-7499                 [?]
  Some high school           20       7500-9999                 [?]
  Finished high school       34       10000-14999               [?]
  Some college               12       15000 and above           [?]
  Finished college            6       No answer                 [?]
  Graduate work               3
  No answerᵃ                  4

Home town                           Geographic origin
  [?]                         9       New England               [?]
  Farm or open country       10       Middle Atlantic           [?]
  Suburb in urban area of:            East North Central        [?]
    Less than 100000         11       West North Central        [?]
    100000 to 499999          6       [?]                       [?]
    500000 to 1999999         6       South Central             [?]
    2000000 or more           6       [?]                       [?]
  City in urban area of:              Alaska or Hawaii            2
    Less than 100000         17       No answerᶜ                [?]
    100000 to 499999          7
    500000 to 1999999         8
    2000000 or more           7
  No answerᵃ                  2

Note. Figures sum to 100 ± 1% within each variable.
ᵃ “No answer” includes “failure to answer,” “refusal to answer,” and “don't know.”
ᵇ Before taxes.
ᶜ Here two non-U.S. citizens are included in “No answer” as well as the categories listed in footnote a.

Rating Sessions. Subjects did the ratings in a
classroom in the administration building of the 
Hospital Corps School, Great Lakes Naval Train- 
ing Center, Great Lakes, Illinois. Eleven 1-hour 
sessions were arranged in the late afternoon and 
evening during April and May 1963. Attendance 
at each session varied from 20 to 70, with about 35 
persons attending each session on the average. The 
experimenter was a male civilian, age 26. 

On entering the room, subjects were given a 
packet of IBM cards wrapped in a sheet of paper 
and a mark-sense pencil. When all subjects for 
the session were seated, the experimenter in- 
structed them to fill out a questionnaire concern- 
ing their social background. Then the experimenter 
gave the instructions for rating words: 


Now let's go on to the instructions for filling 
out the cards. The purpose of this research is to 
make a dictionary of the emotional meanings of 
words. The words you're rating are the 1,000 
most frequently used words in English. Each of 
you has 50 of these words. A regular dictionary 
tells what a word refers to—what it means logi- 
cally. We want to make a dictionary that tells 
what kind of feelings are associated with words 
—what the emotional meanings of these words 
are. To do this we're having people rate the 
words against the adjectives printed on the 
cards. 

You'll notice that in the upper left-hand 
corner of every card a word is printed. [E points 
to the stimulus word on a large demonstrator.] 
There is a sentence in parentheses following 
right after the word. The purpose of the sentence 
is to tell you in what sense to take the word. 
Lots of words have a number of different mean- 
ings. The word BEAR for example can mean “to 
carry something” or it can mean “an animal.” 
On your instruction sheet the word BEAR is fol- 
lowed by the sentence, “That is the Bear.” This 
sentence makes clear that the word here refers 
to an animal. In the same way, the sentences 
following the words on your cards are to clarify 
the meaning of the words you're rating. From 
the sentence you can get an idea of the sense 
in which to take the word that is printed in the 
corner of the card. Once you are clear on the 
meaning of the word, you are to ignore the 
sentence and rate only the word. The sentence is 
there only to help you figure out precisely what 
the word means. 

Printed on the center of the card are eight 
pairs of adjectives. [E reads off the adjectives.] 
Between each pair of adjectives there are seven 
spaces. You are to rate the words by putting a 
mark in the appropriate space between each pair 
of adjectives. Using these spaces you can show 
which adjective in a pair better fits the word 
you are rating and how well it fits. For example 
the person who rated the word on this card,
TORNADO, thought that a tornado is.... [E in-
terprets each mark on the demonstrator card.]
The case here of the person rating a tornado as 
slightly emotional points out something. The 
adjectives will not always make logical sense 
when applied to the words you are rating. You 
are to make your ratings on the basis of what 
you feel is the best fitting rating rather than 
what is logical. Rate on the basis of your first 
impressions. 

Notice that if a word doesn’t mean something 
to you, there is a way to show that. Put a mark 
in the zero position. Not all words are emotional, 
and you can show that by marking them neutral.

The marks you make are to be converted to 
punches by machine. Each mark has to carry an 
electrical current. Every mark you make is, in 
effect, a printed circuit. Therefore, be sure to 
make your marks heavy and black. I repeat: 
make your marks thick and dark. And keep the 
marks within the rounded brackets. 

Are there any questions? Then go ahead and 
rate the words. You may leave when you are 
through. 

Before the overall program of testing was 
completed, all packets from the dictionary sessions 
were examined. Packets which were incompletely 
marked or which obviously had been faked* were 
reproduced and given back to new subjects. 


ANALYSES 


Cards in the packets received back from 
subjects were punched using a mark-sense 
reproducer. Since SD scales were printed 
as rows rather than as columns on the 
cards, the data were punched row-wise 
rather than column-wise. In order to pre- 
pare the data for use with standard com- 
puter programs, the data punched in rows 
were transposed to columns, using a spe- 
cial program written for the IBM 1401 
computer. 

After data collection and preliminary 
machine processing, there were approxi- 
mately 16,500 punched cards, each card 
being a record of one person's ratings of 
one concept on the eight scales. Using the 
IBM 7090 computer, the average profile 
over the eight scales was calculated for 
each of the concepts in the dictionary. 
These mean scores were the materials used 
in further analyses. 


* Faking was detected by regular appearance
of geometric patterns in the rating marks and by
ratings of “neutral” words as extremely polarized
and vice versa. At least a third of the cards in a
deck had to demonstrate such characteristics be-
fore the deck was rejected.


Factor Analyses


To validate using the chosen scales as 
measures of Evaluation, Activity, Po- 
tency, and Stability, correlations among 
the eight scales were obtained over the 
1,000 observations in the dictionary, and 
this matrix of correlations was factor 
analyzed. Factors accounted for 69% of 
the total variance, and three factors ac- 
counted for all of the common variance; 
these factors were clearly recognizable as 
Evaluation, Activity, and Potency. The 
two scales meant to measure the Stability 
dimension showed almost zero correlation 
with one another; one of them loaded 
heavily on the Evaluation factor, the other 
on the Potency factor. 

Examination of the third and fourth 
moments of the scale means (for the 
1,000 dictionary concepts) indicated that 
the distributions on all eight scales were 
highly skewed and peaked (skew and 
kurtosis measures for the scales were all 
significantly different from zero). Since 
product-moment correlations depend on the 
assumption that scores are normally dis- 
tributed, it was possible that correlations 
based on these clearly abnormal distribu- 
tions might be distorted and that the factor 
analysis results might therefore be mis- 
leading. Scales were transformed (using a 
square root transformation—see Walker 
and Ley, 1953, p. 424) so that all distribu- 
tions of scores approached normality, and 
the correlation and factor analyses were
run again. The results of the factor analysis
based on the transformed scores are pre- 
sented in Table 2 along with the results 
of the factor analysis based on untrans- 
formed scores. In the second factor analy- 
sis, three factors—Evaluation, Activity, 
and Potency—again accounted for all of 
the common variance (Potency accounted 
for somewhat more of the variance in the 
second analysis), and the pattern of factor 
loadings was nearly the same as in the 
first analysis. It was concluded that the 
deviations from normality did not sig- 
nificantly affect the validity of the factor 
structure found in the first analysis. 
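
The check on distribution shape can be illustrated with moment-based measures of skew and kurtosis, computed before and after a square root transformation. The sketch below is not the procedure from Walker and Lev, which is not reproduced in this section; the plain square root, the function names, and the sample values are all assumptions made for illustration.

```python
import math


def central_moment(values, order):
    """Mean of the deviations from the mean raised to the given power."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment; zero for a symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; zero for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


# Made-up, right-skewed scale means (not data from the study).
scale_means = [1.0, 1.0, 1.0, 2.0, 2.0, 3.0, 4.0, 7.0]
transformed = [math.sqrt(v) for v in scale_means]

print(round(skewness(scale_means), 3), round(skewness(transformed), 3))
```

For positively skewed scores such as these, the square root pulls in the long upper tail, so the transformed values show less skew than the originals.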

Since a Stability factor accounted for 
none of the common variance in the factor 
analyses of scales, consideration of this
dimension was discontinued for the re- 
mainder of the study. The Stability scales 
were reassigned to the factors on which 
they loaded empirically. 


TABLE 2
FACTOR LOADINGS OF SD SCALES, USING UNCORRECTED SCORES AND SCORES
CORRECTED FOR SKEW AND KURTOSIS
(Based on correlations for 1,000 dictionary concepts)

                          Scores uncorrected             Scores corrected
                                Factor                        Factor
Scale                     1      2      3     h²        1      2      3     h²

Tough-Tender             [?]    [?]    [?]   .79      -.58   -.30   -.60   .79
Still-Lively             .07    [?]    [?]   .70       .07    .88   -.08   .70
Pleasant-Unpleasant      .88   -.20    [?]   .86       [?]   -.22    .25   .86
Untamed-Tamed            [?]    [?]    [?]   .59       [?]    [?]   -.09   .58
Strong-Weak              [?]   -.70    [?]   .60       [?]    [?]    [?]   .64
Passive-Active          -.04    .89    [?]   .79      -.05    .88    .03   .78
Emotional-Rational       .09    .09    [?]   .35       [?]    .00    .54   .30
Good-Bad                 .90   -.28    .05   .88       [?]   -.30    .07   .88

Note. The factor matrices are the result of machine rotation using the varimax criterion.


Calculation of Factor Scores


Regression equations for calculating fac-
tor scores were derived by the short method
given by Harman (1960, pp. 349-356).
Evaluation and Activity factor scores were
based only on scales loading on these fac-
tors, respectively. However, Potency factor
scores were corrected for Evaluation and
Activity contamination, and thus the re-
gression equation for this factor includes
several non-Potency scales. The equations
used to convert the scale means into factor
scores were as follows.


Evaluation factor score
  = [-.297(UP) + .338(TU) - .458(BG) + 1.067] / .938


Activity factor score
  = [.356(LS) + .822(AP) - 5.657] / .908


Potency factor score
  = [-.083(TT) [?] .326(LS) - .118(WS) [?] .420(RE) - .334(BG) + 3.973] / .798


The initials refer to SD scales as indicated 
below. 

1. TT: Tender-Tough 

2. LS: Lively-Still 

3. UP: Unpleasant-Pleasant 

4. TU: Tamed-Untamed 

5. WS: Weak-Strong 

6. AP: Active-Passive 

7. RE: Rational-Emotional 

8. BG: Bad-Good 

The factor scores presented in the dic-
tionary can be considered independent
measurements. The maximum correlation 
between any two sets of factor scores is .17 
(this being between Evaluation and Po- 
tency). Scores presented in the dictionary 
(Appendix C) are fully standardized: 
summing over all dictionary concepts, 
means of the factor scores are zero and 
standard deviations are 1. 
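
Each equation above has the same form: a weighted sum of scale means plus a constant, divided by a scaling term so that the scores are standardized over the dictionary. The sketch below applies that form to the Activity equation only, since its coefficients are the most legible. The function name, the dictionary layout, and the example scale means are hypothetical, and the scoring direction of each scale is an assumption not settled by this section.

```python
def factor_score(scale_means, weights, constant, divisor):
    """Weighted sum of scale means plus a constant, divided by a scaling term."""
    weighted_sum = sum(weights[scale] * scale_means[scale] for scale in weights)
    return (weighted_sum + constant) / divisor


# Coefficients as read from the Activity equation in the text.
ACTIVITY_WEIGHTS = {"LS": 0.356, "AP": 0.822}
ACTIVITY_CONSTANT = -5.657
ACTIVITY_DIVISOR = 0.908

# Hypothetical scale means for two concepts (not data from the study).
near_average = {"LS": 4.8, "AP": 4.8}
more_extreme = {"LS": 6.0, "AP": 6.0}

for means in (near_average, more_extreme):
    score = factor_score(means, ACTIVITY_WEIGHTS, ACTIVITY_CONSTANT, ACTIVITY_DIVISOR)
    print(round(score, 2))
```

With these coefficients, a concept whose two scale means both sit near 4.8 receives an Activity score close to zero, which is consistent with factor scores that are standardized to a mean of zero.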


Standard Error of Factor Scores 


In order to obtain an estimate of the 
average standard error of the SD factor 
scores, 15 words were selected from the list 
of 1,000 words (using random numbers) 
and scaled a second time by subjects from 
the same population. These ratings were 
made during regular sessions, and subjects 
who received the reliability-study packets 
were unaware of their special role. Just as 
with the words in the dictionary, the rat- 
ings for these 15 words were converted to 
mean scale scores and then to factor 
scores. Hence, for 15 randomly selected
words, two SD profiles were available: one
from the regular dictionary work and an-
other from a second group of subjects.


TABLE 3
ANALYSES OF VARIANCE OF SD SCORES FOR 15 RANDOMLY SELECTED WORDS,
WITH ERROR TERMS BASED ON TWO SCALING REPETITIONS

Source of variation          df        SS        MS

Evaluation factor scores
  Words                      14     35.935     2.567
  Repetitions                15      3.409      .227
  Total                      29     39.344

Activity factor scores
  Words                      14     15.406     1.100
  Repetitions                15      2.440      .163
  Total                      29     17.846

Potency factor scores
  Words                      14     26.070     1.862
  Repetitions                15      4.033      .269
  Total                      29     30.103

These two sets of data were combined in 
one-way analyses of variance (one separate 
analysis was carried out for each factor 
score—Evaluation, Activity, and Potency). 
In these analyses of variance, words repre- 
sented the different variables (or levels)— 
hence there were 15 variables in each, and 
the two sets of factor scores for each word 
constituted repetitions. Thus, three analy- 
ses of variance were carried out, each with 
15 variables and two repetitions (see
Table 3). 

In these analyses the error variance 
based on repetitions provides a basis for 
estimating the standard error of the factor
scores. The error variance of the Evalua-
tion factor scores based on repetitions is
.227; therefore an estimate of the standard
error of Evaluation scores in the dictionary
is .48 (i.e., the square root of .227). Simi-
larly, on the basis of these analyses of
variance, the standard error of Activity 
scores can be estimated as .40 and the 
standard error of Potency scores as .52. 
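
The three standard errors follow directly from the repetition rows of Table 3: each mean square is the sum of squares divided by its 15 degrees of freedom, and the standard error is its square root. A short sketch of that calculation, with illustrative variable names:

```python
import math

# Sums of squares for repetitions, from Table 3.
repetition_ss = {"Evaluation": 3.409, "Activity": 2.440, "Potency": 4.033}
repetition_df = 15

standard_errors = {}
for factor, ss in repetition_ss.items():
    mean_square = ss / repetition_df          # error variance
    standard_errors[factor] = math.sqrt(mean_square)

for factor, se in standard_errors.items():
    print(factor, round(se, 2))
```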

These estimates of the standard error
(combined with the fact that every score
in the dictionary is based on ratings of
16 subjects) indicate that any two factor
scores in the dictionary which differ by as
much as 1.00 units may be taken as sig-
nificantly different at the .05 level (in a two-
tailed test).


Effects of Defining-Sentences on SD Scores 


In preparing words for rating, several 
conventions were adopted. Nouns were de- 
fined by usage in the singular form in 
defining sentences; verbs were defined by 
presenting them in the simple past tense. 
In so doing, it was assumed that plurality 
in the case of nouns and tense in the case 
of verbs have no systematic effect on 
words’ connotative meaning. Also it was 
assumed that, while a defining sentence 
clarifies denotative meaning, its mere pres- 
ence does not affect a word’s connotation. 
In the course of the study, data were 
gathered to test the validity of these as- 
sumptions. 

The subjects who rated the 15 words 
used in deriving standard errors of factor 
scores also rated 35 other words, specially 
selected and prepared as follows. 

Verbs. Fifteen verbs were selected ran- 
domly from the dictionary list. For 10 of 
these, defining sentences were written with 
the verb appearing in the perfect tense; for 
the other 5 the sentences were written with 
the verb in the present tense. Otherwise the 
defining sentences were the same as those 
used in the dictionary work. 

Nouns. Ten nouns were randomly se- 
lected from the dictionary list. For each of 
these, the defining sentence was phrased so 
that the plural form of the noun was used 
instead of the singular. 

Single-Meaning Words. Ten words were 
picked from the dictionary, each of which 
has but a single meaning or else has second- 
ary meanings which are extremely rare 
(according to information in West’s [1953] 
semantic frequency count). In presenting 
these words as stimuli, no defining sentences 
at all were provided. 

Ratings for all these special stimuli were 
converted to factor scores in the usual way. 
Hence, for each of the words treated, two 
SD ratings were available: one from the 
dictionary work and one from the second 
set of ratings. These two sets of data were 
merged into a series of analyses of vari-
ance as described in the preceding section.


TABLE 4
ANALYSES OF THE EFFECTS OF DEFINING SENTENCES ON SD RATINGS OF WORDS:
VARIANCES DUE TO TREATMENTS-PLUS-ERROR COMPARED

Source of variance             df       MS        F

Evaluation factor scores
  Treatment in sentence:
    Perfect tense              10      .075      .330
    Present tense               5      .230     1.013
    Plurals                    10      .197      .868
    No defining sentence       10      .183      .806
  Error varianceᵃ              15      .227

Activity factor scores
  Treatment in sentence:
    Perfect tense              10      .237     1.454
    Present tense               5      .094      .577
    Plurals                    10      .280     1.718
    No defining sentence       10      .370     2.270
  Error varianceᵃ              15      .163

Potency factor scores
  Treatment in sentence:
    Perfect tense              10      .238      .885
    Present tense               5      .154      .572
    Plurals                    10      .100      .372
    No defining sentence       10      .202      .751
  Error varianceᵃ              15      .269

Note. F.05 = 2.55 for n1 = 10 and n2 = 15.
ᵃ Based on simple repetitions of scaling procedure (see Table 3).


Again, the “error” variance in each analy- 
sis was of special interest: it constituted a 
measure of the average difference between 
the factor scores for dictionary words and 
the factor scores for the specially treated 
words. If the treatments had any system-
atic effects on SD ratings, then this error
variance would be larger than expected 
because of mere sampling variability. The 
error variance in such an analysis would 
be inflated since it would be composed of 
both actual error variance and variance due 
to the treatment. 

Estimates of the actual error variance 
due to simple repetition have been derived 
in the preceding section for each type of 
factor score. These estimates of true error 
variance can be used as a base in compari- 
sons with the variances obtained in this 
section using the F statistic. Thus it can 
be determined if the error variances derived 
in this section are significantly larger than 
the true error variances estimated in the
preceding section. 
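The comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the original computation: the mean squares are transcribed from Table 4, while the names `TABLE_4`, `F_CRITICAL_05`, and `f_ratios` are hypothetical. Each F is simply the treatment-plus-error mean square divided by the repetition-based error mean square for the same factor.

```python
# Mean squares transcribed from Table 4. "error_ms" is the error variance
# estimated from simple repetition; the other entries are the
# treatment-plus-error variances. All names here are illustrative.
TABLE_4 = {
    "Evaluation": {
        "error_ms": 0.227,
        "treatments": {"Perfect tense": 0.075, "Present tense": 0.230,
                       "Plurals": 0.197, "No defining sentence": 0.183},
    },
    "Activity": {
        "error_ms": 0.163,
        "treatments": {"Perfect tense": 0.237, "Present tense": 0.094,
                       "Plurals": 0.280, "No defining sentence": 0.370},
    },
    "Potency": {
        "error_ms": 0.269,
        "treatments": {"Perfect tense": 0.238, "Present tense": 0.154,
                       "Plurals": 0.100, "No defining sentence": 0.202},
    },
}

# Critical value quoted in the table note for 10 and 15 degrees of freedom.
# The present-tense rows have 5 df, so this cutoff is only approximate there.
F_CRITICAL_05 = 2.55


def f_ratios(table):
    """Return {(factor, treatment): F} with F = treatment MS / error MS."""
    ratios = {}
    for factor, entry in table.items():
        for treatment, ms in entry["treatments"].items():
            ratios[(factor, treatment)] = ms / entry["error_ms"]
    return ratios


if __name__ == "__main__":
    for (factor, treatment), f in f_ratios(TABLE_4).items():
        verdict = "exceeds" if f > F_CRITICAL_05 else "is below"
        print(f"{factor:10s} {treatment:22s} F = {f:.3f} ({verdict} {F_CRITICAL_05})")
```

Run as a script, the sketch prints one line per treatment; the largest ratio is the one for "no defining sentence" on the Activity scores.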

The end results of the analyses are pre- 
sented in Table 4. None of the special 
treatments are associated with significantly 
large F values. The single case of near sig- 
nificance is the effect of “no defining sen- 
tence” on Activity scores. In this case, a 
more detailed analysis indicates that the 
influence (if any) of mere presence of a 
defining sentence is not uniform. Words 
with defining sentences are rated neither 
consistently more active nor consistently 
more passive than the same words with- 
out defining sentences. (Student's t equals 
.117 in the appropriate statistical test.)

It can be concluded that the SD ratings
presented in the dictionary would be sub-
stantially the same even if: (a) verbs had
been defined using some tense other than 
the simple past; (b) nouns had been defined 
in the plural form; or (c) definition had 
been achieved somehow without using de- 
fining sentences. 


APPLICATIONS 


Once the dictionary was assembled, it 
became possible to continue with the origi- 
nal research. A preliminary study was con- 
ducted whose procedures and results may 
serve as an illustration of the dictionary’s 
use. 

Subjects from the same population as 
those who had done the dictionary ratings 
were asked to make SD ratings for de- 
scriptions of two motivations—n Affiliation 
and n Achievement. Each motivation de- 
scription listed a series of activities which 
are characteristic of persons aroused in that 
motivation. Averaging subjects’ ratings for
these descriptions and converting to SD 
factor scores yielded profiles which repre- 
sented the two motivations in terms of SD 
dimensions. 

These two profiles were used as reference 
profiles in calculating motivation scores for 
the words in the dictionary. In calculating 
word scores, all SD profiles were treated as 
points in a three-dimensional space. Moti- 
vation scores were assigned to words on 
the basis of their “distance” (i.e., D score;
see Osgood et al., 1957, pp. 90-97) from 


the motivation reference points. The moti-
vation scores thus calculated are included
as part of the dictionary presented here.
(Further details of the calculations are 
given in Appendix C.) 

These motivation scores for words were 
used to score a set of published stories 
(Atkinson, 1958, Appendix I) for both n 
Affiliation (n Aff) and n Achievement (n 
Ach). To score a story, a list was made 
of the word concepts appearing in the story 
which also were in the dictionary (repeated 
usages of the same word concept in the 
same story were ignored). For each story 
and for both n Aff and n Ach scores, the 
mean motivation score of the words in the 
story was calculated. A correlation analy- 
sis was run comparing these mean word- 
scores with the published motivation scores 
for the same stories based on imagery scor-
ing. The hypothesis of the focal study pre-
dicted a positive correlation between the
two types of motivation scores (those
based on the dictionary of SD profiles and
those derived by the independent tech-
nique of imagery scoring), and this was the
case: for n Aff, r = .43, and for n Ach,
r = .40 (N = 69 and p < .001 in both
cases). Such results indicate that the SD 
profiles presented in the dictionary have 
real meaning as a basis for psychological 
research. The technique of calculating
motivation scores demonstrates one way in 
which the materials can be put to use. 
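The story-scoring step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the original procedure: it matches lower-cased surface tokens rather than word concepts (the dictionary distinguishes senses by defining sentence and groups inflected forms), and the function name `score_story`, the tokenizing rule, and the choice to return `None` for a story with no dictionary words are all hypothetical. The three n Ach values in the usage example are taken from the dictionary rows for ABLE, BEST, and BUILD.

```python
import re
from statistics import mean


def score_story(text, word_scores):
    """Mean motivation score of the distinct dictionary words in a story.

    Repeated uses of the same word count once, as in the procedure
    described in the text. Returns None when no dictionary word occurs
    (an assumption; the source does not say how such stories were handled).
    """
    # Distinct lower-cased alphabetic tokens; a set drops repeated usages.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    matched = [word_scores[t] for t in tokens if t in word_scores]
    return mean(matched) if matched else None


if __name__ == "__main__":
    # n Ach scores for three dictionary entries (illustrative subset).
    n_ach = {"able": 0.77, "best": 1.07, "build": 1.84}
    story = "He is able to build the best one. He is able."
    print(score_story(story, n_ach))
```

Story scores obtained this way for each motive could then be correlated with the published imagery-based scores, for instance with `statistics.correlation` in Python 3.10 or later.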

Some other uses for the dictionary are
also evident. Using the dictionary as a
source of data, a variety of psycholinguistic
and social psychological experiments are
possible: studies of phonetic symbolism,
studies of factors related to polarization
(emotionality) of words, studies of role
images in the family (father, son, sister,
etc.) are all possible using the materials
presented. Additional studies could be de-
veloped by combining the materials with
additional data (the above derivation of
profiles for motivations serves as illustra-
tion of such a procedure). As a handbook,
the dictionary could serve as a useful re-
search aid, as, for example, in balancing
the social desirability of items in question-
naires. As a sample of ratings indicating
the affective connotations of words for a
well-defined population of subjects, it 
could stand as one of a set of research 
dictionaries, such as the cross-cultural 
series planned by Osgood (1964). 


DATA CUMULATION


Efforts toward data cumulation are
underway in behavioral sciences. For ex-
ample, in anthropology there exist the
Human Relations Area Files and Mur-
dock’s (1957) “World Ethnographic Sam-
ple” (which is being continuously improved
and extended through a special department
in the journal Ethnography). The Ameri-
can Documentation Institute provides a
depot for raw and partially summarized 
data which someday might be organized 
into useful and accessible reference ma- 
terials. Also, here and there in the litera- 
ture, compendiums of data have been as- 
sembled which are of considerable value to 
researchers, e.g., Hilgard’s (1951) presenta- 
tion of the association values of nonsense 
syllables. Such efforts gain their signifi- 
cance from the fact that data cumulation 
reduces redundancies of effort and thereby 
can lead to more efficient use of resources 
and accelerated progress in research. 
Regretfully, much of the data gathered
in psychology is not subject to cumulation, 
because of lack of standardization of meas- 
uring procedures. Though standardization 
in many areas is still infeasible and in 
some instances undesirable, SD data can 
be treated as standardized or semistand-
ardized (i.e., different sets of data can be
made comparable through mathematical 
translations). Hence, fulfillment of the 
following two additional requirements 
would allow cumulation of SD data to
begin. 

Systematic Choice of Stimuli. If investi- 
gators using semantic differentials follow 
systematic criteria in choosing concepts 
for scaling, useless redundancies and un-
measured voids can be avoided in the
cumulation process. For example, if an in- 
vestigator is intending to present the 
images MOTHER and SISTER to a group of 
subjects, he should include other family
figures as well, even though he himself 


may have no immediate use for this addi- 
tional data. 

Specification of Sample of Raters. Data 
should be presented concerning the condi- 
tions under which measurements were made 
and concerning the sample of raters. In 
this way, information is available so that 
at some later time studies can be grouped 
on the basis of experimental variables and 
the populations of raters. 

How does the present study fit these 
criteria for data cumulative-ability?

It was with standardization in mind that 
the scales Good-Bad, Active-Passive, and 
Strong-Weak were chosen. These had been 
widely used and verified in previous works 
as useful measures of the three major SD 
factors. (It later was found that Osgood is 
developing a set of standard SD scales for 
American subjects based on pancultural 
factorizations. The ratings reported here 
should be convertible mechanically to 
“standardized ratings,” i.e., ratings equiva-
lent to those which would be attained with
Osgood's standard scales, with a reason-
able degree of precision.) Further, measure-
ments are reported as factor scores.
Through this procedure, the effects of the 
unique variances of scales are minimized, 
and measurements can be taken as rela- 
tively comparable to those of other studies 
even though based on different scales. 

The criterion for selecting concepts was 
frequency of use. All word concepts which 
are listed as having a frequency of 337/5- 
million or more in West’s semantic fre- 
quency count are given here with the excep- 
tion of the words listed in Appendix A and 
the well-defined class of function words. 
Thus, if it is decided to extend the size of 
the dictionary using the same frequency 
criterion for selecting words there is little 
danger of redundancy or gaps. 

Full details on procedure have been 
given in this report. A questionnaire was 
administered to all subjects who served as 
raters specifically in order to define the 
social characteristics of the sample of 
raters. Information was gathered on age, 
socioeconomic status, urbanization, and 
geographic origin. These data are presented 
in full as Table 1. 
The dictionary presented is a start in
the process of systematically accumulat-
ing SD data. The study illustrates how,
with little additional effort, investigators
pursuing their own interests can contribute
to the data cumulation process.
APPENDIX A


All of the following word concepts are listed by West (1953) as having frequencies greater than 
336/5-million. However, they were deleted on a subjective basis from the list because: (a) they seemed
unlikely to appear in extemporaneous stories, (b) they seemed unlikely to be a part of the working 
vocabulary of the subjects being tested, and/or (c) the frequency cited for them appeared to be grossly 
inaccurate on consulting other published frequency counts. 


Account (That is his account of it.) 
Age (What is his age?) 

Air (It is air.) 

Animal (There is the animal.) 
Board (He is on the board of directors.) 
Case (That was the case then.) 
Church (It is about the church.) 
Clothing (His clothing is there.) 
Clouds (They are clouds of war.) 
Coal (That is coal.) 

Coin (He has some coins.) 

Corn (He has some corn.) 

District (It is in that district.) 
Earth (It is of the earth.) 

Figure (He added the figures.) 
Figure (The figure is on page 50.) 
Gold (That is gold.) 

Industrial (It is industrial.) 
Language (That is his language.) 
Leaf (It is about leaves.) 


Number (It is the number 50.) 
Observation (It takes observation.) 
Press (He is from the press.) 
Railroad (There is the railroad.) 
Record (He recorded their times.) 
Size (It is that size.) 

Song (It is a song.) 

Study (There is a study of it.) 
Sugar (It is sugar.) 

System (He has a system.) 
System (It is a system of ideas.) 
Table (It is on the table.) 

Today (It is so today as always.) 
Union (They have union.) 

Up (It is up to them.) 

Upper (It is the upper one.) 
Village (He is in the village.) 
Weight (That is its weight.) 
Year (It is once a year.) 


DAVID R. HEISE


APPENDIX B 


Following are the words used in constructing the definition sentences for words in the dictionary.
It was necessary to include some content words: nouns (50, idea, number, person, street, thing, and
time) were chosen for their maximal utility and their minimal affective content; verb forms (be, do,
have) were chosen because they commonly serve as auxiliaries and thus might be expected to be more
neutralized than other verbs. Using only the following 67 words, 90.9 percent of the words were defined.


a, an here some 
about idea street 
along in than 
all into that 
and it the 
as not then 
at now there 
away number these 
be, are, is, was, were, will of they, their, them 
by often thing 
do, did, does, doing, done on this 
50 one those 
for out time 
from own to 
has, had, have person up 
he, him, himself, his same with 
her 
APPENDIX C


In the following dictionary the affective con- 
tent of frequent words is indicated through a 
series of numerical indexes. The first four num- 
bers following any word represent customary SD 
information. The number listed in the “Eval” 
column is the word's score on the Evaluation di- 
mension, the number in the “Actv” column is 
the word's rating on the Activity dimension, and 
the number in the “Potn” column is the word's
rating on the Potency dimension. The number in
the “Polr” column is the word's polarization or
“distance” from neutrality in the semantic space:
it is obtained by squaring and adding the first 
three scores and taking the square root of the 
sum. 
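As a minimal sketch of the polarization index just described, the following Python function squares and adds the three factor scores and takes the square root. The function name `polarization` is hypothetical; the two worked profiles are the dictionary entries for ABLE and for ATTACK (It was an attack).

```python
import math


def polarization(evaluation, activity, potency):
    """Distance of an SD profile from the neutral origin of the semantic space."""
    return math.sqrt(evaluation ** 2 + activity ** 2 + potency ** 2)


if __name__ == "__main__":
    # ABLE: Eval 1.45, Actv 0.44, Potn 0.93 (tabled Polr 1.78)
    print(round(polarization(1.45, 0.44, 0.93), 2))
    # ATTACK: Eval -2.27, Actv 2.36, Potn 0.94 (tabled Polr 3.41)
    print(round(polarization(-2.27, 2.36, 0.94), 2))
```

Because the tabled factor scores are themselves rounded to two places, a recomputed value can differ from the printed "Polr" entry by about .01 for some words.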

The SD scores given are standardized factor 
scores. They were computed by regression equa- 
tion and have a high degree of orthogonality. For 
Evaluation, positive scores mean good. For Ac- 
tivity, positive scores mean active. For Potency, 
positive scores mean tough (strong, hard). 

The numbers in columns five and six are meas- 
ures of the words’ n Affiliation and n Achieve- 
ment content. A position in the semantic space 
was located to represent each of these motiva- 
tions. 


Eval 
ABLE (He is an able person) 1.45 
ABOUT (There are about 50 of them) —1.75 
ABOVE (It is above that thing) 0.05 
ACCEPT (He accepted the things) 0.80 
ACCEPT (He accepted the ideas) 0.77
ACROSS (It is across the street) 0.36 
ACT (He acted on the idea) 0.47
ACT (It was an act, not an idea) 0.35 
ACTION (It was action, not an idea) —0.37 
ACTUAL (The actual number was 50) —0.14 
ADMIRAL (He is an admiral) 0.58 
ADMIT (He admitted it) 2d
ADOPT (He adopted their ideas) 0.54 
ADVANTAGE (He had an advantage) 0.71 
AFFAIR (It is his own affair) 0.23 
AFFECT (It affected him) —0.68 
AGAIN (He did it again) —0.81 
AGAIN (He is himself again) 0.52 
AGO (It was some time ago) aa


AGREE (He agreed to it) 


ALL (All were there) 0.23 
ALL (It was all his own idea) —0.02 
ALLOW (He allowed it) 0.05 
ALMOST (There are almost 50) —0.11 
ALONE (He is alone) —1.96 
ALONG (It is along the street) —0.72 
ALREADY (He already has it) —0.60 
ALSO (He also has it) ean 


ALWAYS (He always does it) 


The profile of the motivation reference point
can be represented as $E_m$, $A_m$, $P_m$. The profile
of a word to be scored can be represented as $E_w$,
$A_w$, $P_w$. The distance between the word and the
motivation reference point is

$$D_{mw} = \sqrt{(E_m - E_w)^2 + (A_m - A_w)^2 + (P_m - P_w)^2}.$$

If $D_{mw}$ was greater than or equal to 4.0 then $D_{mw}'$
was set equal to 4.5; otherwise $D_{mw}'$ was the
same as $D_{mw}$. To obtain the final score which ap-
pears here (and which increases as motivation
word association increases) $D_{mw}'$ was subtracted
from 4.5. For words outside the motive region,
this score is always zero; for words within the re-
gion the score varies from 0.5 to 4.5. The actual
profiles ($E_m$, $A_m$, $P_m$) used in calculations were

the following:


n Aff: 3.12, 1.11, —3.75 
n Ach: 1.97, 3.56, 2.90


High positive scores indicate high motivation 
content. 
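The scoring rule above can be sketched in Python as a check on the tabled values. This is an illustrative sketch, not the original program: the names `N_AFF`, `N_ACH`, and `motivation_score` are hypothetical, while the reference profiles, the 4.0 cutoff, and the 4.5 ceiling are those stated in the text. The worked profiles are the dictionary entries for BABY and BUILD.

```python
import math

# Reference profiles (Evaluation, Activity, Potency) given in the text.
N_AFF = (3.12, 1.11, -3.75)
N_ACH = (1.97, 3.56, 2.90)


def motivation_score(word_profile, reference, cutoff=4.0, ceiling=4.5):
    """Score a word by its closeness to a motivation reference point.

    The Euclidean distance D is replaced by `ceiling` when D >= cutoff,
    and the result is subtracted from `ceiling`, so words outside the
    motive region score zero and closer words score higher.
    """
    distance = math.dist(word_profile, reference)
    adjusted = ceiling if distance >= cutoff else distance
    return ceiling - adjusted


if __name__ == "__main__":
    baby = (1.39, 1.42, -3.20)   # BABY (It is her baby)
    build = (1.03, 1.54, 1.45)   # BUILD (He built it)
    print(round(motivation_score(baby, N_AFF), 2))   # tabled n Aff: 2.66
    print(round(motivation_score(baby, N_ACH), 2))   # tabled n Ach: 0
    print(round(motivation_score(build, N_ACH), 2))  # tabled n Ach: 1.84
```

`math.dist` requires Python 3.8 or later; on older versions the distance can be computed with `math.sqrt` over the summed squared differences.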

Users of the dictionary may find the ID num- 
bers of aid in drawing random samples from the 
total set of words. 


Actv Potn Polr n Aff n Ach ID
0.44 0.93 1.78 0 0.77 000 
0.31 —0.16 1.78 0 0 001 
—0.68 0.00 0.69 0 0 002 
—0.10 —0.67 1.05 0 0 003 
0.84 0.20 1.16 0 0 004 
—0.40 018 0.57 0 0 005 
0.55 0.45 0.85 0 0 006 
0.13 -0.29 0.47 0 0 007 
1.43 0.23 150 0 0 008 
—0.30 0.13 0.35 0 0 009 
0.72 1.49 1.75 0 1.08 010 
—0.16 0.21 0.36 0 0 011 
—0.31 —0.800 1.02 0 0 012 
0.62 1.14 148 0 0.85 013 
0.30 —1.53 [588 10:77 0 014 
—0.66 —0.85 1.27....0; 0 015 
0.06 —0.41 0.91 0. 0 016 
—0.12 —0.98 1.11 0.51 0 017 
—1.09 —0.17 1.28 0 0 018 
—0.90 —0.28 0.98 0 0 019 
0.98 —0.52 1.13 0 0 020 
0.88 0.0 0.88 0. 0 021 
—0.70 —0.35 0.78 0. 0 022 
0.31 0.21 0.39 0 0 023 
—1.88 —1.26 2.9 0 0 024 
—0.88 —0.12 1.14 0 0 025 
—0.26 —0.344 0.74 0 0 026 
L0.65 038 0.95 0. 0 027 
0.41 —0.93 1.00 0. 0 028 
Eval Actv Potn Polr n Aff n Ach
AMOUNT (It is that amount) ,0.33 —0.82 0.75 1.16 0. 0. 
ANOTHER (He has another) —0.24 —0.90 —0.51 1.07 0. 0. 
ANOTHER (That is another matter) —0.41 —0.82 —0.25 0.95 0. 0. 
ANSWER (He answered) 0.48 0.33 0.11 0.59 0. 0. 
ANSWER (He had the answer) 0.72 0.47 0.85 1.21 0. 0.58 
ANYTHING (He does anything) —0.67 0.36 —0.34 0.84 0. 0. 
APPEAR (It appeared to be that) —0.06 —0.23 0.59 0.63 0. 0. 
APPEAR (Then he appeared) 0.53 —0.65 —0.50 0.98 0. 0. 
APPOINT (They appointed him) 1.15 1.08 0.58 T8." 0. 1.01 
APPRENTICE (He is an apprentice) 0.50 —0.23 —0.03 0.55 0. 0. 
ARGUE (They argued) —2.85 1.32 —0.50 3.18 0. 0. 
ARGUMENT (They had an argument) —2.47 1.38 0.23 2.84 0. 0. 
ARISE (The idea arose) 0.63 0.88 —0.44 137 0. 0. 
ARM (It is his arm, not his leg) 0.05 0.92 0.84 1.25 0. 0.64 
ARM (They armed themselves) —1.10 1.89 1.21 2.50 0. 0.62 
ARMS (They brought arms with them) —1.81 2.04 0.92 2.88 0. 0. 
army (He is in the Army) —0.60 0.93 1.77 2.09 0. 0.65 
AROUND (They are all around) —0.93 0.99 CIEN. 0. 
ARRIVE (Then he arrived) 0.71 1.20 —0.83 1.62 0.71 0. 
ART (It is an art) » 0.76 —0.34 —1.61 1.82 1.01 0. 
ARTIST (He is an artist) 0.66 0.48 —1.37 1.60 1.03 0. 
ARTICLE (This is the article) 0.43 —0.37 —0.38 0.68 0. 0. 
ask (He asked about it) —0.08 0.39 0.34 0.52 0. 0. 
ASK (He asked for it) —0.27 —0.60 —0.24 0.75 0. 0. 
ATTACK (It was an attack) —2.27 2.36 0.94 3.41 0. 0. 
ATTEMPT (It was his own attempt) 0.05 0.95 0.82 1.25 0. 0.64 
ATTEMPT (He attempted it) —0.35 1.12 1.87 2.21 0. 0.97 
ATTENTION (He had their attention) —0.19 0.17 1.09 112 0. 0. 
away (He is away) —1.43 —0.47 —1.44 2.08 0. 0; 
AWAY (He did away with it) —1.75 0.03 —0.87 1.95 0. 0. 
BABY (It is her baby) 1.39 1.42 —3.20 3.76 2.66 0. 
BACK (He is back) 0.48 | —0.34  —0.39  Á 0.71 0. 0. 
BACK (It is in the back) —1.39 -—0.80  —0.29 1.63 0. 0. 
BAD (It was a bad idea) —3.35 —0.79 0.50 3.48 0. 0. 
BALL (He has the ball) 0.43 1.40 1.06 172807) 70! 1.26 
BANK (He is at the bank) 1.52 0.66 2:85 2.88 0. 1.51 
BANK (He is on the river bank) 0.95 —0.77 0.83 1.48 0. 0. 
BATTLE (They had a battle) —2.93 1.82 0.53 3.49 0. 0. 
BEAUTY (It had beauty) 2.21 —1.23 —2.40 3.49 1.65 0. 
BEAUTIFUL (It was beautiful) 0.81 —0.34 —2.83 2.97 1.62 0. 
BECOME (Then it became that) —0.59 0.62 —0.19 0.88 0. 0. 
BED (It is a bed) 1.37 —1.4 9 -0.63 2.06 0. 0. 
BEFORE (He was there before) —0.45 —0.52 —0.02 0.69 0. 0. 
BEGIN (Then it began) 0.25 1.26 —0.42 1.35 0. 0. 
BEGINNING (That was the beginning) 0.70 —0.33 —0:8- (017810 9, 0. 
BEHIND (He was behind) —1.08 —0.71 —0.29 1.28 0. 0. 
BELIEVE (He believed it) } 0.39 —0.42 —0.79 0.98 0. 0. 
BELONG (It belonged to him) 0.22 —0.88 0.03 0.91 0. 0. 
BEST (That is best) 1.31 1.01 0.71 1.80 0. 1.07 
BETTER (That is better) 0.54 —0.51 —0.522 0.91 0. 0. 
BIG (It is big) Х —0.94 0.77 0.67 1.39 0. 0. 
BIRD (It is a bird) 1.07 2.05 —2.17 3.17 1.74 0. 
BLACK (It is black) —1.80 —2.07 1.24 3.01 0. 0. 
BLOOD (That is blood) —1.11 1.33 —0.49 1.80 0. 0. 
BLOW (The wind blew) —1.09 1.21 -0:00. 163 VO! 0. 
BLUE (It is blue) 0.84 —0.91 —0.87 1.80 0. 0. 
BOAT (There is the boat) Р 1.08 0.67 0.84 1.53 0. 0.84 
BODY (It is of the body, not mind) 0.83 0.321  À —0.08 0.89 0 0 
Bopy (There was a body of them) —0.10 0.8 -063 108 0. 0. 
воок (That is the book) 1.07 -0.56 0:22 122 9 0 
BORN (He was born there) 101  —0.2 -171 215 099 0. 
вотн (He has both) 0.0 -046  —019 050 0. 90. 
вотн (It is both this and that) —0.40  —0.522  —0.10 0.66 0. 0. 
BOX (It is in the box) 0.07 -1.38 0.62 1.52 0. 0. 092
BOY (The boy is there) 0.72 1.16 -0.57 1.48 0.52 0. 093
BREAD (The bread is there) 1.14 -1.75 -0.20 2.10 0. 0. 094
BREAK (He broke it) —2.44 0.73 —0.97 2.72 0. 0. 095 
BRIDGE (They are on the bridge) 0.9 | —1.34 180 2.4 0. 0. 096 
BRIGHT (It is bright) —0.11 0.50 —0.08 0.52 0. 0. 097 
BRING (He brought them) 0.50 —0.31 -0.59 0.83 0. 0. 098 
BROAD (It is broad) —0.72 —1.41 0.66 1.72 0. 0. 090 
BROAD (He has broad ideas) 0.65 0.49 0.57 0.99 0. 0. 100 
BROTHER (It is his brother) 1.41 1.13 —0.14 1.81 0.51 0.56 101 
BUILD (He built it) 1.03 1.54 1.45 2.35 0. 1.84 102 
BUILDING (It is that building) 0.80 —1.42 2.40 2.93 0. 0. 103 
BuRN (It burned) —2.47 0.74 —0.52 2.63 0. 0. 104 
BUSINESS (He is there on business) —0.06 0.60 0.60 0.85 0. 0. 105 
BUSINESS (He has his own business) 0.20 0.90 0.97 1.34 0. 0.76 106 

' BUY (He bought it) 0.69 0.20 0.68 0.9 0. 0. 107 
CALL (He called them) —0.25 0.00  —0.4 1.08 0. 0. 108 
can (He can do it) 0.40 0.35 0.80 0.96 0. 0. 109 
car (He has a car) à 0.24 0.49 1.04 1.18 0. 0.51 110 
CARE (He is in their care) 1.24 —0.9 -0.97 1.81 0.59 0. 111 
CARRY (He carried it) —0.16 0.05 0.82 0.88 0. 0. 112 
CARRY (He carried on) —0.38 1.20 —0.62 1.40 0. 0. 113 
catcH (Then they caught him) —0.80 0.51 0.86 1.28 0. 0. 114 
CAUSE (That caused it) —2.06 0.16 0.58 2.14 0. 0. 115 
CAUSE (That was the cause of it) —1.23 —0.60 0.28 1.40 0. 0. 116 
CENTER (It is in the center) —0.08  —1.39 0.00 1.39 0. 0. 117 
CERTAIN (It is at a certain time) —0.02  —0.83  -—0.33 0.89 0. 0. 118 
CERTAINLY (He certainly does) —0.28 0.37 -0.28 0.54 0. 0. 119 
CHANCE (There is a chance of it) —0.02 0.21 ^ —0.49 0.53 0. 0. 120 
CHANGE (It is a change) 0.29 0.38 0.56 0.74 0. 0. 121 
CHANGE (He changed) —0.22 —0.11 0.61 0.60 0. 0. 122 
CHARACTER (It is of this character) 0.83 0.77 0.39 1.20 0. 0.57 123 
CHARACTER (He has character) 1.59 1.88 —0.09 2.24 0.50 0.89 124 
CHIEF (That is the chief thing) 0.16 0.72 1:62 — 1.789590. 0.90 125 
CHIEF (He is their chief) —0.62 1.31 1.21 1.89 0. 0.67 126 
CHILD, CHILDREN (It is а child) 1.54 2.32 —2.99 4.04 эре 4 is 
CHILD, CHILDREN (It is his child) 1.10 0.77 —2.44 2.78 2.07 р : 
cHoosE (He chose that) 0.09 0.08 —0.1 0.69 rs 0. m 
CHURCH (He is at the church) 2.01 —1.05 0.11 2.27 dm 4 in 
CHURCH (He is at church) 2.40 —0.48 —0.77 2.57 б Г 

is i i = 1.08 0.52 1.4 0. 0. 132 
сттү (He is in the city) 0.75 029 0 0 133 

/ CLASS (It is in that class) —0.14 —0.23 0.11 0.59 0. 0. 134 
CLASS (There are social classes) 0.10 0.39 TA vio б oy us 
CLASS (He is at class) 0.32 0.02 1. Yom o ed 
CLASSROOM (He is in the classroom) 0.29  —1.04 0.92 ds » A qs 
CLEAR (It is clear) 0.76 —0.23 00 oen i BS 
CLOSE (They are close) —0.8 —034 — `0 097 0. 0. 139 
CLOSE (He closed it) —0.93 —0.27 у a Suh б, 4 
CLOUD (It is a cloud) 0.19 | —0.4 014 189 0. 0.722 141 
CLUB (He is in the club) 0.62 1.78 EU 258 0. 0. 142 
AE e 3 ES 199 250.» 1189 M5 
COLLEGE (He is in college) 1.07 x" ES. P eb 0. "eh 
COLOR (That is the color) 1.09 ad E 0.48 б. 0; 145 
COME (He came to the thing) 0.04 x id 0.68 0:73 0. 0. 146 
come (It came to be) 0.09 ea SEU TUE ЕУ 0. 147 
COME (It came from that) 0.16 —0. 0.97 0.50 bi 0. 148 
CoMMrTTEE (He is on the committee) —0.35 A Gib recta. MÒ: (90/1140 
COMPANY (The company employs 50) 0.66 018  —0.00 021 0. 0. 150 
COMPLETE (It is complete) —0.08 —017 0.97 073 0. 0. 151 
COMPLETE (He completed it) 0.66 nat Чан LB TANS 0. 159 
CONCERN (It concerned him) —0.79 ET m lieb 0 153 
CONDITIONS (Conditions are the same) —0.16 L0.80 051 123 0. 0. 154 
CONDITION (It is in that condition) 0:79 eos i ; 1 
Eval Actv Potn Polr n Aff
CONNECTION (That is the connection) —0.11 —0.7 0.56 0.98 0. 
constpER (He considered the idea) 0.49 0.14 —0.01 0.51 0. 
CONTAIN (It contained things) 0.20 —1.48 0.54 1.58 0. 
сохтехт (He is content) 1.28 —1.02 —1.40 2.15 0.84 
coNTINUE (It continued) —0.85 -1.H —0.20 1.41 0. 
coxTROL (He has control of it) 0.89 0.75 1.69 2.00 0. 
cost (That is its cost) —1.00 —0.40 0.62 1.24 On 
country (He is in the country) 1.16 0.08 | —0.74 1.388 0.77 
country (He did it for his country) 1.33 1.37 0.21 21.92 0. 
course (Of course he is) —0.47 0.25 0.44 0.69 0. 
course (It is the course of things) —0.02 0.78 —0.01 0.78 0. 
covnT (He із in court) —1.31 0.85 0.95 1.82 0. 
cover (He covered it) 0.49 —0.80 0.50 1.06 0. 
cross (He crossed the street) —0.05 0.46 —0.50 0.68 0. 
crown (There is a crowd) —1.80 1.16 —0.19 2.15 0. 
ery (He cried out) —1.30 0.19 —1.61 2.08 0. 
custom (It is their custom) —0.05 0.60 —0.61 0.85 0. 
cur (He cut it) —1.88 0.95 —0.6 2.21 0. 
DANGER (He is in danger) —2.75 1.49 0.53 3.17 0. 
DANGEROUS (That is dangerous) —2.43 1.86 1.04 3.23 0. 
DAUGHTER (It is his daughter) 1.68 —0.18 —2.46 2.98 2.18 
pay (It was so in those days) 0.20 —0.37 —0.21 0.47 0. 
рлү (He did it on that day) 0.93 0.28 0.00 0.98 0. 
pay (He did it during the day) 0.07 —0.56 —0.18 0.59 0. 
DAYDREAM (He daydreamed) 0.42 —1.76 —1.75 2:51 0. 
DEAD (He is dead) —1.77 —4.17 —0.64 4.58 0. 
DEATH (It is about death) —2.76 —2.29 0.10 3.59 0. 
pest (He has debts) —3.08 —0.39 0.01 3.11 0. 
DECIDE (He decided to do it) 0.42 0.71 0.59 1.01 0. 
DECISION (It is his decision) —0.46 —0.16 0.40 0.63 0. 
DEEP (It is deep) —1.30 —1.37 1.38 2.34 0. 
DEGREE (It is to that degree) —0.57 —0.17 0.90 1.08 0. 
DEMAND (There is a demand for it) —0.15 0.91 0.87 1.27 0. 
DEMAND (He demanded it) —1.41 1.80 0.67 2.38 0. 
DEPARTMENT (He is in the department) 0.15 0.25 0:60. 10:07000. 
DESCRIBE (He described it) —0.06 —0.59 —1.35 1.47 0. 
DESIRE (He has a desire for it) 0.65 1.55 —1.96 2.59 1.42 
DESIRE (He desired it) 0.43 0.58  —1.28 147 0.81 
DESTROY (He destroyed it) —1.89 1.50 0.07 2.42 0. 
DEVELOP (He developed the idea) 0.35 0.64 1.44 1.62 0. 
DEVELOPMENT (It is in development) 1.10 0.68 0.26 1.32 0. 
DIE (He died) —1.54 —2.54  —0.82 3.08 0. 
DIFFERENCE (That is the difference) —0.10 —0.28 0.34 0.45 0. 
DIFFERENT (That one is different) —0.76 —1.28 0.49 1.57 0. 
DIFFICULT (It is difficult) —2.11 0.13 0.88 2.30 0. 
DIFFICULTY (That is a difficulty) —1.97 —0.68 —0.28 2.10 0. 
DIRECT (He is a direct person) 0.12 1.40 0.76 1.60 0. 
DIRECTION (It is in that direction) —0.14 —0.66 0.22 0.71 0. 
DISCOVER (He discovered it) 0.53 1.03 0.65 1.88 0. 
DISCOVERY (It is his discovery) 0.90 1.47 0.39 ТЕКО: 
piscuss (They discussed it) 0.17 0.82 0.15 0.85 0. 
DISEASE (It is a disease) —3.46 0.70 0.15 34i 0; 
DISTANCE (That is the distance) —0;42 0:18 LOG Us)! 0; 
ро (He did it) —0.72 0.39 0.15 0.84 0. 
ро (He did without it) —0.731 —0.91 0:54 ^ 4:97 «00; 
ростов (The doctor is there) 0.88 0.73 —0.66 1.32 0.67 
noa (The dog is there) 1.30 0.93 —0.90 1.84 1.12 
DOLLAR (There are 50 dollars) 0.30 0.59 1.07 1.26 0 
poor (The door is locked) =0.42 ==1.87 1.19 2.01. 0. 
DOORWAY (He is in the doorway) 0.09 —1.63 1.38 2.14 0 
DOUBT (There is doubt about it) 1:19 йт —100 иво: 
pown (It is down there) -1.M  —0.38 -0.06 11 0. 
pown (He is down and out) -1.60  —239  —0.18 2.80 0. 
DRAW (He drew it behind him)
DREAM (He dreamed) 

prink (He drank to them) 

prive (He drove them hard) 
DRIVE (He drove there) 

DUE (This is due to that) 

puty (He is duty-bound) 

puty (Не is on duty) 

EAR (It is his ear) 

EARLY (It is early) 

EASY (It is easy) 

EASILY (He did it easily) 

EAT (He ate) 

EFFECT (It has an effect) 

ЕРЕОНТ (He did it with effort) 
EGG (It is an egg) 

ELECT (They elected him) 
ELECTION (There was an election) 
ELECTRIC (It is electric) 

ELSE (He has nothing else) 
EMPIRE (He has his own empire) 
EMPLOY (He employed the idea) 
END (That is the end) 

END (He ended that) 

ENEMY (He has enemies) 

ENJOY (He enjoyed it) 

ENLIST (He enlisted) 

ENOUGH (He has enough) 
ENOUGH (It is smooth enough) 
ENTER (He entered on the stage) 
ENTIRE (He has the entire thing) 
ENTIRELY (He did it entirely) 
ESCAPE (He escaped) 

EVEN (Even he does) 

EVENING (It is evening) 

EVENT (It is an event in time) 
EVER (Was he ever there) 

EVER (It will be for ever and ever) 
EVERY (This is for every person) 
EVERYTHING (He has everything) 
EVERYWHERE (They are everywhere) 
EXAMPLE (It is an example) 
EXCELLENT (It is excellent) 
EXIST (They existed for some time) 
EXPECT (He expected it) 
EXPERIENCE (He experienced it) 
EXPERIMENT (It is an experiment) 
EXPLAIN (He explained it) 
EXPRESS (He expressed the idea) 
EXTENT (That is the extent of it) 
EYE (That is his eye) 

FACE (That is his face) 

FACT (It is a fact) 

FACTORY (He is at the factory) 
FAIL (He failed) 

FAILURE (It is a failure) 

FAIR (Tt is fair in quality) 

FALL (It fell) 

FALL He fell) 

FAMILY (He has a family) 
FAMOUS (He is famous) 

FAR (It got this far) 

FAR (That is far more) 




FAR (It is far off) 

FARM (He is on the farm) 
FARMER (He is a farmer) 

FAST (It is fast) 

FATHER (It is his father) 

ravor (He favored it) 

rear (He has fear of it) 

rear (He feared it) 
FEEL (He felt it) 

FELLOW (It is that fellow) 

FEW (There аге a few) 

riELD (He is on the field) 

riGHT (He fought) 

riGnT (It is a fight) 

FIGURE (He saw a figure there) 
кил, (He filled it) 

FULL (It is full) 

FINALLY (He finally did it) 

rinp (He found it) 

FIND (They found him guilty) 
FINE (He is a fine person) 

FIRE (It is a fire) 

FIRST (That is first) 

rix (He fixed it up) 

rix (He fixed his eyes on it) 
FLOW (It flowed) 

FLOWER (There is the flower) 
FOLLOW (It follows from that idea) 
FOLLOW (They followed him home) 
FOLLOW (They followed their chief) 
FOOTBALL (It is about football) 
FORCE (His ideas have force) 
FORCE (He is in the force) 

FORCE (He forced it) 

FOREIGN (It is foreign) 

FORGET (He forgot) 

FORM (It has this form) 

ковм (He formed this from that) 
FORMER (That was in former times) 
FRATERNITY (He is in a fraternity) 
FREE (He is a free person) 
FREEDOM (He has freedom) 

FRESH (That is fresh) 

FRIEND (It is his friend) 
FRIENDLY (He is friendly) 

FRONT (It is in front) 

FULL (He has the full set) 
FUTURE (It is about the future) 
GAIN (He gained from it) 

GAIN (That is a gain) 

GAME (He is at the game) 

GARDEN (He is in the garden) 
GATHER (He gathered them) 
GENERAL (It is a general election) 
GENERAL (That is the general idea) 
GENERAL (There is the general) 
GENTLEMAN (He is a gentleman) 
cet (He got done) 

GET (He got the things) 

вет (He got it done) 

сет (He got off the bus) 

GIRL (There is the girl) 

GIRL (There is his girl) 
SEMANTIC PROFILES FOR 1,000 WORDS
Eval Actv Potn Polr n Aff n Ach ID
GIVE (He gave it to them) 0.86 0.36 -1.10 1.44 OM 0. 344
GIVE (He gave a speech) 1.16 1.85 -0.36 1.80 0.58 0. 345
GIVE (He gave in to them) -0.00 -1.07 -1.76 2.06 0. 0. 346
GLAD (He is glad about it) 1.19 0.29 —2.39 2.68 2.0 0. 347 
во (He went on—goes on) —0.41 0.77 —0.21 0.9 0. 0. 348 
во (He went to them—goes to them) —0.11 0.60 —0.69 0.92 0. 0. 349 
60 (It went, sour—goes sour) —1.37 —0.39 1.27 1.91 0. 0. 350 
вор (It is about God) 2.35 0.73 —1.00 2.4 2.19 0. 351 
воор (It is a good taste) 1.57 —0.36 —0.71 1.22-..,0:08. МЕ. 352 
соор (He is a good person) 2.05 0.86 —1.41 2.63 1.92 0. 353 
соор (It is a good job) 1.47 —0.25 —0.57 1.00 0.67 0. 354 
соор (It is for his own good) 0.14 —0.51 1.41 1.51 0. 0. 355 
соор (He has the goods) —0.09 —1.11 —0.29 1268 .:9, 0. 356 
GOVERNMENT (Не is in the government) 0.56 1.30 1.20 1,90 (00. 1.37 357 
GOVERNMENT (It is about government) —0.02 1.31 111 1.72 0. 1.00 358 
GREAT (It is for great persons) 0.92 0.60 0.19 1.12 O0. 0. 359 
GREAT (They come in great numbers) —0.50 1.16 0.80 1.53 0. 0. 360 
GREAT (He had a great time) 1.55 2.41 —-1.85 3.42 1.72 0. 361 
GREEN (It has green in it) 0.84 —1.11 —0.92 1.67 0. 0. 362 
GREY—GRAY (It has grey in it) —0.12 —1.60 OW Aleit 0. 363 
GROUND (It is on the ground) —0.05 —0.78 0.73 1.07 0. 0. 364 
Group (He is in that group) —0.73 1.12 0.20 1.85 0. 0. 365 
Grow (Their numbers grew) —0.33 0.69 0.42 0.88 0. 0. 366 
crow (He grew up) 0.79 1.68  —0.19 1.87 0. 0.69 367 
HAND (That is his hand) 0.51 0.46 0.00 0.93 0. 0. 368 
HAND (It is out of hand) —2.13 1.42 —0.40 2.59 0. 0. 369 
HANG (He hung it there) —0.91 —1.44 1.54 2.30 0. 0. 
HAPPEN (Then it happened) —0.53 —0.23  —0.49 0.76 0. 0. 
HAPPY (He is happy) 1.64 1.79  Á-229 8.84 02.322 . 0. 
HARD (He tries hard) 0.64 1.76 1.47 2.38 0. 1.84 
HARD (That is hard to do) —1.69 0.13 1.81 2.48 :0. 0. 
HARD (That is hard as rock) —0.90]  —1.78 3.77 4.27 0. 0. 
HARD (He is a hard person) —1.56 0.20 2.45 2.91 0. 0. 
HARDLY (There is hardly time) —1.46 —0.68 0.05 161 0. 0. 
HATE (He hated them) —3.11 0.11 —0.61 3.17 0. 0. 
HAS, HAVE (He had it) —0.17 —0.36 0.49 06 0. 0. 
HAS, HAVE (He had to do it) —0.37  —0.00  Á—0.48 0.61 0. 0. 
НАЗ, HAVE (Не had it done) —0.25 0.70 0.38 0.84 0. 0. 
HEAD (That is his head) 0.13 0.36 0.87 0.95 0. 0. 
HEALTH (It is about health) 1.29 0.70 0.07 147 0. к 
HEAR (He heard it) —0.07 0.37  —0.45 0.59 0. 0: 
HEAR (They heard his idea) 0.36 —0.01 0.00 0.36 0. 0. 
HEAR (He heard from them) 0.19 0.12  —1.16 1.18 би. sh 
HEART (The idea is from his heart) 1.53 1.00 —2.67 3.26 $ о 
HEAT (It is about the heat) —0.52 —0.12 0.17 hr » p 
HEAVY (That thing is heavy) -1.68 -1.40 1.85 : 132 0.
HELP (He helped them) 1.08 024 —1.48 1.84 e e 
HERE (He is here) 0.95 —0.538  —0.22 0.62 > a 
HIDE (He hid it) -1.59 0.22 -1.19 2.00 » 0.
HIDE (He hid there) —1.97 —0.90 | —0.94 d Y iN 
HIGH (It is a high number) И go, 
HIGH (It is up high) —0.68 0.51 per c N 
HILL (There is the hill)
HISTORY (That is its history) 0.27 0.77 0.6
HOLD (He held on to it) ERU. EE. OX cA n 1.17 
HOLD (He held his own) 0.31 1.04 ie 205 iino) 
HOME (He is at his home) 1.48 QM. Z1 3.00 2.74 0. j 
HOME (He went home) 2.12 1.18 = 173 0.75 0. 
Que (e hoped rope Lan LON oL vi 0 
HOPE (He has hope)
HOSPITAL (There is a hospital)
HOT (That thing is hot) dm 0.63 077; 1001-20: 0.


HOUR (He did it in an hour) 


HOUSE (They are at his house)
HOUSE (It is a business house)
HOW (That is how it is done)
HUMAN (It is about humans) 
HUSBAND (There is her husband) 
HURT (He hurt himself)

IDEA (It is his idea)

IMAGINE (He imagined it) 
IMPORTANT (It is important) 
IMPORTANCE (It has importance) 
IMPOSSIBLE (It is impossible) 
INCLUDE (It included that) 
INCREASE (The number increased) 
INCREASE (It is an increase of 50) 
INDEED (It is indeed) 
INDEPENDENT (He is independent) 
INDICATE (He indicated that)
INDUSTRY (It is an industry) 
INFLUENCE (He influenced them) 
INFLUENCE (He has influence) 
INFORMATION (He has information) 
INSTRUCTOR (He is an instructor) 
INTEREST (It is an interest of his) 
INTEREST (It interested him) 
INTRODUCE (He introduced it) 
INVENTION (It is an invention) 
INVENTOR (He is an inventor) 
INVITE (He invited them) 

IRON (It is iron) 

JOB (He has a job)

JOB (He did the job)

JOIN (He joined in)

JOY (That is joy)

JUDGE (He is a judge)

JUST (Just then he did it)

JUST (He was just there)

KEEP (He kept on doing it) 

KEEP (He kept it for them) 

KILL (He killed them)

KIND (It is that kind)

KISS (They kissed)

KNOW (He knew about it)

KNOW (He knew them)
KNOWLEDGE (He has knowledge) 
LADY (That is the lady) 

LAKE (That is the lake) 

LAND (It is his native land) 

LAND (It is on land, not sea) 
LAND (He owns this land) 

LARGE (It is large) 

LAST (It is the last time) 

LAST (At last it is done) 

LATE (He is late) 

LATE (It is late) 

LATE (It is about the late Mr. X) 
LATER (He did it later) 

LAUGH (He laughed) 

LAW (It is a law)

LEAD (He led them there) 
LEADER (He is the leader) 

LEARN (He learned it) 

LEARN (He learned of it) 
LEARNING (He has learning) 
LEAVE (He left)

LEAVE (He left it)

LEFT (It is on his left)

LENGTH (It is a length of time)

LESS (There are less of them)

LET (He let them do it)

LETTER (It is a letter)

LIE (He lay on the bed)

LIE (It lay there)

LIFE (It is his own life)

LIFE (He is the life of the party)

LIFE (Things do not have life)

LIFT (He lifted it)

LIGHT (It is in the light)

LIKE (He does it like this) 

LIKE (He liked it) 

LIKELY (It is likely) 

LIMIT (He limited himself)

LIMIT (There is a limit) 

LINE (It is in the line of sight) 

LINE (He read this line) 

LISTEN (He listened to it) 

LITERATURE (It is literature) 

LITTLE (It is little) 

LITTLE (He has little) 

LITTLE (It is little known) 

LIVE (He lived to be 50) 

LIVE (He lived there) 

LOCAL (It is a local street) 

LONG (It was a long time) 

LONG (He did as long as they did) 

LONG (That was long before)

LOOK (He looked at them) 

LOOK (It looked like that) 

LOSE (He lost it) 

LOSS (It is a loss)

LOVE (He is in love)

LOVE (He loved)

LOW (The land is low)

LOW (It is a low number)

MACHINE (It is a machine) 

MACHINERY (It is machinery) 

MAIN (That is the main thing) 

MAKE (He made ready) 

MAKE (He made that) 

MAKE (He made it do) 

MAN (That is the man) 

MAN (He is a man now) 

MANUFACTURE (They manufactured them)

MANY (There are many) 

MARK (He marked it there) 

MARKET (There is a market for it) 

MARRY (He married her) 

MATERIAL (He has the materials) 

MATTER (He looked into the matter) 

MAY, MIGHT (He may be there) 

MAY, MIGHT (He may as well do it)

MAY, MIGHT (He may go if he wants)

MEAN (He did it by that means) 

MEAN (He meant to do it) 

MEAN (It meant this) 

MEET (He met them) 
DAVID R. HEISE
Eval Actv
MEETING (The meeting is there) -0.03 0.47
MEMBER (He is a member) 1.14 1.51
MENTION (He mentioned it) -0.07 -0.47
MERELY (It is merely that) -0.22 -1.25
METAL (It has metal in it) 0.58 -1.82
MIDDLE (It is in the middle) -1.14 -1.63
MILE (It is a mile to there) -0.97 -0.42
MILK (He has the milk) 1.38 -0.96
MIND (It is about the mind) 0.56 0.86
MINUTE (He did it in a minute) 0.38 0.70
MISS (It missed) -1.82 0.09
MISSING (It is missing) -2.12 0.11
MODERN (It is modern) 0.86 1.13
MOMENT (He was there for a moment) 0.06 0.30
MONEY (He has the money) -0.24 0.94
MONTH (It was in that month) 0.36 -0.10
MOON (The moon is up) 0.83 -1.71
MORE (The idea is more than that) -0.14 -0.17
MORE (He has more of these) -0.30 -0.88
MORNING (It is morning) 0.42 -0.41
MOTHER (It is his mother) 1.68 1.38
MOTOR (The motor is there) 0.15 0.81
MOUNTAIN (There is the mountain) 0.00 -1.13
MOUTH (That is his mouth) -0.65 0.68
MOVE (It moved) -0.74 0.67
MOVEMENT (They are in that movement) -0.65 1.57
MOVEMENT (There is movement there) -0.10 0.82
MUCH (That is much of it) -1.09 -1.38
MUCH (That is much more) -0.77 -0.32
MUSIC (It is music) 2.08 1.33
MUST (He must do it) -1.51 0.09
NAME (It is his name) 0.43 0.23
NAME (He named it) 0.07 -0.60
NATION (He is from that nation) 1.42 0.38
NATIONAL (It is national) 0.21 0.28
NATIVE (He is a native of the state) 0.62 -0.11
NATURE (That is the nature of it) 0.02 -1.06
NAVY (He is in the Navy) 0.62 1.41
NEAR (It is near that) -0.14 -1.34
NEARLY (He is nearly there) -0.34 -0.83
NECESSARY (It is necessary) -0.65 -0.24
NEED (He needed it) 0.14 0.00
NEED (He need not have done it) -1.04 0.34
NEED (There is a need for it) -0.27 0.05
NEIGHBOR (It is his neighbor) 0.40 -0.10
NERVOUS (He is nervous) -1.99 -0.53
NEVER (He never does) -1.03 -0.83
NEW (It is new) 0.89 -0.31
NEWSPAPER (There is a newspaper) 0.78 0.64
NEXT (That is next) 0.16 -0.04
NICE (It is nice) 1.76 -1.28
NIGHT (It is night) 1.15 -1.05
NO (No, it is not) -2.09 0.27
NONE (There are none) -1.82 -2.00
NOT (It is not) -1.36 -0.57
NOTE (He noted it) -0.05 -0.37
NOTHING (There is nothing) -1.01 -2.00
NOTICE (He noticed it) 0.19 -0.42
NOW (He is there now) 0.24 0.24
NOW (Now it was done) 0.11 -0.36
NUMBER (There are a number of them) -0.34 -0.28
NUMEROUS (They are numerous) -0.56 0.16
OBJECT (The object is there) 0.22 -1.28
Eval Actv Potn Polr n Aff n Ach ID
OBSERVE (He observed the things) 0.96 -0.96 0.19 1.37 0. 0. 595
OCCASION (It is an occasion) 1.00 0.40 -1.14 1.56 1.00 0. 596
OFF (It is off from the others) -0.53 -0.93 0.54 1.19 0. 0. 597
OFFER (He offered it to them) 1.00 -0.26 -0.43 1.12 3 0. 598
OFFICE (He is in office) 0.19 -0.26 0.47 0.57 0. 0. 599
OFFICE (That is his office) 0.79 0.66 0.61 1.20 0. 0.601 600
OFFICER (He is the officer) 0.19 1.54 1.89 2.45 0. 1.62 601
OFFICIAL (He is an official) -0.87 0.29 0.70 “kab. D. 0. 602
OFFICIAL (It is official) -0.40 0.75 108 З.Ш 0.69 603
OFTEN (He often does) 0.31 0.80 -0.32 0.91 0. 0. 604
OIL (It is oil) 0.36 -0.35 2.13 2.19 0. 0. 605
OLD (He is old) -0.70 -2.91 -0.73 3.08 0. 0. 606
OLD (It is old) -0.53 -1.92 0.29 2.01 0. 0. 607
OLD (He spoke of old times) 0.62 -0.78 1.27 1.0 0.51 0. 608
OLDER (He is older) 0.08 -1.27 0.34 1.88 ibs 0. 609
ONCE (He did it once) 0.09 -1.0 -0.1 1.0 0. 0. 610
ONCE (Once it was so) -0.18 -1.04 -1.00 1.45 0. 0. 611
ONE (He is one of them) -0.60 -1.44 -0.11 1.5; . 0. 0. 612
ONE (One is a number) 0.34 -0.93 1.00 1.42.9. 0. 613
ONLY (There are only 50) -0.40 -0.48 0.388 0.73 0. 0. 614
OPEN (It is open to them) 1.15 -0.27 0.25 1.20 0. 0. 615
OPINION (It is an opinion) -0.02 -0.08 0.27 0.28 0. 0. 616
OPPOSE (He opposed it) -1.04 1.05 1.35 2.0 0. 0. 617
ORDER (It has order) -0.00 0.46 -0.43 0.63 0. 0. 618
ORDER (That is an order) -1.44 0.52 1.34 2.08 0. 0. 619
ORDER (He ordered them to do it) -1.20 1.29 1.29 2148 0. 0. 620
ORIGINAL (It was the original idea) 1.09 1.05 0.08 1.51 0. 0.62 621
OTHER (He has the other) -1.08 -1.05 -0.0 1.51 0. 0. 622
OTHER (He has other ideas) -0.27 0.54 -0.18 0.63 0. 0. 623
OUGHT (It ought to be) -0.57 -1.14 -0.0 1.28 0. 0. 624
OUT (He is out in the street) -0.96 -0.17 0.40 1.05 0. 0. 625
OUT (That is out of bounds) -2.10 0.17 0.40 2.14 0. 0. 626
OUT (It is out now) -1.62 -1.05 0.81 1.91 0. 0. 627
OUT (They are out of them) -1.77 -0.65 -0.60 1.99 0. 0. 628
OUTSIDE (He is outside) -0.14 0.46 -0.36 0.60 0. 0. 629
OVER (It is over there) -0.43 -0.37 -0.20 0.60 0. 0. 630
OVER (It is over the others) -0.87 -0.32 -0.05 0.93 0. 0. 631
OVER (They are all over) -0.69 -0.69 -0.17 0.99 0. 0. 632
OVER (It was all over nothing) -1.57 -0.36 0.10 1.61 0. 0. 633
OWN (It is his own) 0.52 -0.12 0.45 0.70 0.
PAPER (There is some paper) 0.20  —1.18 1.00 1.59 am m са 
PARENTS (His parents are there) 1.34 —0.0 —1.41 1.95 i 3 [oH 
PART (That is part of it) —0.12 —0.02 0.4] 0.43 | A 4 
PART (He did his part) 0.46 —0.05  —0.83 0.95 р » 8 
PARTICULAR (It is a particular kind) —0.06 —0.89 0.37 0.97 2 m cio 
PARTY (He is in the party) 0.61 2.0 0.71 2.29 0. 0. 641 
PASS (He passed by them) -0.72 0.94 0.14 1.19
PASS (He passed the test) 1.14 0.80 10.53
PASS (He passed them in) 0.19 -0.33 0.07
а is his past) I 25 E. 0.82 0. 0. 645 

PAY (He paid) —0. —0. . d s x 

PEACE (There was peace) 2.03 —1.82 Ei on mh 4 M: 
PEOPLE (There are the people) 0.37 1.56 т бю бй. 0. 648 
PEOPLE (It is for the people) —0.01 ب‎ m r1 " 0. hie 
PERHAPS (Perhaps it is) -0.79  —0.78 ET 2:5 0. 0. 650 

PERMANENT (It is permanent) 0.02 -1.99 0.76 19% 0. 0. 651
PERMIT (He permitted it) 006 $0.99 L073 078 0. 0. 652 
PERSON (There is the person) 0.19 0.19 —0.84 196 0. 0. 653 

PERSONAL (It is personal) 0.19  —1.76 6 ош в, Үй 664. 
PICTURE (That is his picture) 141  —1.79 ES 15 0. @, 655 
PIECE (That is a piece of it) jp ap "os 1117 0. 0. 656 
PLACE (It is a public place) up 158 0.19 1.70 0. 0. 657


PLACE (It is in its place) 
PLACE (He placed it there)
PLAN (That is the plan)
PLAY (He played)
PLAY (He wrote a play)
PLEASANT (It is pleasant)
POET (He is a poet)
POINT (He pointed to it)
POINT (He is at that point)
POINT (He has a point)
POLITICS (He is in politics)
POOR (The poor thing is cold)
POOR (He is poor)
POPULAR (It is a popular idea)
POSITION (It is in position)
POSSESS (He possessed it)
POSSIBLE (It is possible)
POWER (It has the power)
POWER (He is a power)
PRACTICALLY (It is practically done)
PREPARE (He prepared for it)
PRESENT (He does at present)
PRESENT (He presented it to them)
PRESENT (He is present)
PRESENCE (It is the presence of it)
PRESSURE (It is a pressure)
PREVENT (He prevented it)
PRICE (That is the price)
PRIVATE (It is private)
PROBABLY (It probably is)
PROBLEM (He has a problem)
PRODUCE (They produced it)
PRODUCT (He sells the product)
PROFESSION (It is his profession)
PROFIT (There is a profit)
PROGRESS (That is progress)
PROPER (It is the proper thing)
PROPERTY (It is his property)
PROTECT (He protected it)
PROVE (It proved to be so)
PROVE (He proved it)
PROVIDE (He provided it)
PUBLIC (It is public)
PUBLIC (It is for the public)
PULL (He pulled it)
PURPOSE (That is the purpose)
PUT (He put it there)
QUALITY (It is of quality)
QUANTITY (There is a quantity of it)
QUESTION (It is a question of time)
QUESTION (That is his question)
QUICK (It was quick)
QUICKLY (He did it quickly)
QUIET (It is quiet)
QUIT (He quit)
QUITE (That is quite so)
RATE (That is the rate)
RATHER (It is this rather than that)
RATHER (That is rather often)
REACH (Then it reached here)
READ (He read of it)
READY (He is ready)
REAL (It is real)
REALLY (It is this really)
REALLY (That is really often)
REALIZE (He realized that it was so)


REASON (That is the reason)
RECEIVE (He received it)

RECENT (It is recent)

RECENTLY (It was recently done)
RECORD (That is the record of it)
RED (It is red) 

REDUCE (They reduced it) 
REFUSE (He refused) 

REGARD (He regarded it with care) 
RELATION (That is the relation) 
RELIGION (That is his religion) 
RELIGIOUS (He is religious) 
REMAIN (He remained there) 
REMAIN (He remained) 
REMEMBER (He remembered) 
REPLY (He replied)

REPORT (That is his report) 
REPORT (He reported on it) 
REPRESENT (It represented that) 
REST (Here are the rest) 

REST (He rested) 

RESULT (That is the result) 
RESULT (It resulted from that) 
RETURN (He returned) 


RETURN (It is the return of spring) 


RICH (He is rich)

RIDE (He rode on it) 

RIGHT (It is his right) 

RIGHT (That is the right one) 
RIGHT (It is on his right) 
RIGHT (He is in the right) 
RISE (It rose) 

RIVER (There is the river) 
ROAD (That is the road) 

ROCK (It is a rock) 

ROOM (That is his room) 

RUN (He ran to them) 

SAFE (He is safe now) 

SAILOR (He is a sailor) 

SAND (That is sand) 

SATISFY (He satisfied himself) 
SAVE (He saved them in time) 
SAY (He said it) 

SCARCE (These are scarce) 
SCENE (He was at the scene) 
SCHOOL (He is at school) 
SCIENCE (It is a science) 

SEA (There is the sea) 

SECOND (He did it a second time) 
SEE (He saw the thing) 

SEE (He saw how to do it)
SEE (He saw them about it) 
SEEM (It seemed to be that) 
SELL (He sold it) 

SEND (He sent it to them) 
SENSE (It is one of the senses) 
SERIOUS (He is a serious person) 
SERVE (It served for that) 
SERVICE (He did them a service) 
SERVICE (He is in the service) 
SET (He set the date) 

SET (He set it there) 
Eval Actv Potn Polr n Aff n Ach
SET (He set it free) 0.20 0.08 -1.26 1.28 0.53 0.
SETTLE (He settled for that) -0.47 -0.10 -0.31 0.57 0. 0.
SETTLE (He settled in that town) 0.66 0.30 -0.45 0.86 0. 0.
SEVERAL (There are several) -0.13 -0.28 -0.08 0.32 0. 0.
SHIP (There is the ship) 0.70 0.77 1.45 1.78 0. 1.10
SHOOT (He shot at them) -2.46 0.30 0.04 2.48 0. 0.
SHORE (It is along the shore) 0.50 1.01 0.10 1.13 0. 0.
SHORT (It is short) -0.91 -1.08 0.42 1.47 0. 0.
SHOULD (He should do it) 0.03 0.95 0.23 0.98 0. 0.
SHOULD (It should be there) -0.31 -0.92 0.46 1.07 0. 0.
SHOULDER (That is his shoulder) -0.12 0.26 0.68 0.74 0. 0.
SHOW (He showed it to them) 0.37 0.85 -0.56 1.08 0. 0.
SIDE (It is on that side) -0.34 -0.41 0.55 0.76 0. 0.
SIDE (He is on their side) -0.69 0.42 -0.69 1.06 0. 0.
SIGN (That is a sign of it) 0.01 -0.61 0.07 0.62 0. 0.
SIGN (He signed) 0.08 -0.04 -0.55 0.56 0. 0.
SILENCE (There was silence) 0.63 -2.74 -0.06 2.81 0. 0.
SILENT (It was silent) 1.63 -3.41 0.21 3.79 0. 0.
SILVER (It is silver) 1.23 -0.85 1.40 2.05 0. 0.
SIMPLE (It is simple) 0.81 -0.43 -0.29 0.97 0. 0.
SIMPLY (He simply does) -0.77 -1.58 -1.12 2.08 0. 0.
SING (He sang) 0.65 0.78 -2.38 2.59 1.66 0.
SINGLE (Not a single thing is there) -1.73 -2.08 0.21 2.72 0. 0.
SINGLE (He is single) -0.72 0.56 0.05 0.92 0. 0.
SISTER (It is his sister) 1.11 0.48 -2.80 3.05 2.19 0.
SIT (He sat there) -0.15 -1.53 -0.10 1.54 0. 0.
SITUATION (That is the situation) -1.40 -0.15 -0.59 1.53 0. 0.
SKY (It is in the sky) 0.48 -0.87 -0.90 1.34 0. 0.
SLEEP (He slept) 1.67 -3.20 -0.67 3.67 0. 0.
SLOW (It is slow) -0.66 -2.09 -0.17 2.19 0. 0.
SMALL (It is small) 1.08 -0.91 -0.76 1.60 0. 0.
SO (That is so often) -0.46 -0.44 -0.05 0.64 0. 0.
SO (So it is) -0.26 -0.90 0.48 1.05 0. 0.
SOCIAL (It is a social matter) 0.55 0.31 -1.09 1.26 0.72 0.
SOFT (It is soft) 1.49 -2.16 -1.38 2.97 0. 0.
SOLDIER (He is a soldier) 0.43 1.13 1.44 1.88 0. 1.27
SOLVE (He solved it) 0.59 1:25 E26 4 Smo 1.35 
SOME (There are some) C048 063 001 0.79040. - 0
SOME (He has some things) 0.48 -0.41 0.74 0.97 0. 0.
SOMETHING (That is something) 0.0 —0.14 —0.66 0.68 0. 0. 
SOMETIMES (It is sometimes) —1.04  —1.23 —0.53 1.69 0. 0 
SOMEWHAT (It is somewhat) 1.02 -0.77 -0.34 1.32 0. 0.
SON (It is his son) 1.5 1.49  —2.0 2.89 2.00 0. 
SOON (It soon will be) 0:08 10 006 OG оёт о. ^0.
SORROW (He has sorrows) күй FO  -24 234970; 0.
SORRY (He is sorry) 0:13. КЕЕ; 0160. OIDE wid: 20:
SORT (It is some sort of thing) 20.71 $59.05 087 1.08500. ^ 0. 
SOUL (It is about the soul) 0.68 0.2 -187 199 111 0. 
SOUND (It sounded the same) 0.46 ^" 0.121 -0.46 0.66 0. 0. 
SPACE (There is a space) Eos Mewes LSM Aegan O21 bu. O.
SPEAK (He spoke of it) 0.50 -0.00 -0.27 0.57 0. 0.
SPECIAL (It is special) 0.88 120.63 -0.68 1.27 0. 0.
SPEECH (There was a speech) 0.02 90.0.80 018 11090» ' 0.
SPEND (He spent time with them) 0.87 0.60 -0.52 1.18 0.53 0.
SPEND (He spent himself) -1.09 0.14 -0.79 1.85 0. 0.
SPIRIT (How are his spirits) 0.37 0.15 -0.0 0.45 0. 0.
SPREAD (It spread) —0.86 —0.09 0.12 0. ; 0. 
SPRING (It is spring) 1.59 1.05 j icti t 
STAND (He stood) 0.15  —0.70 "m т 10 
STAND (It is to stand as it is) —0.87  —0.54 v 19 A nh 
STANDARD (This is the standard) 0.00 —0.48 0.87 da hi i 
STAR (The stars are out) 1.23 —1.83 1.40 $a 0. 0. 
STAR (He is a star) 1.55 1.93 -0.15 2.48 0. 1.01
TAKE (He took the prize) 891




TAKE (He took it away) 


TAKE (He took that street) 


Eval Actv Potn Polr n Aff n Ach ID
START (He started to do it) -0.50 0.65 -0.07 0.82 0. 0. 848
STATE (There are 50 states) 1.11 1.02 0.87 1.4 0. 1.13 849
STATE (He is in a state of shock) -2.80 -1.79 -0.18 3.33 0. 0. 850
STATE (He stated that it is so) -0.60 -0.00 0.38 0.7 0. 0. 851
STATEMENT (It is his statement) -0.27 -0.18 0.30 0.4 0. 0. 852
STATION (There is the station) 0.57 -0.26 0.41 0.75 0. 0. 853
STAY (He stayed) -0.00 -0.81 -0.04 0.84 0. 0. 854
STEEL (It is of steel) 0.37 -2.01 4.00 6.04 0. 0. 855
STEP (It is a step forward) 0.83 0.97 0.38 1.33 0. 0.70 856
STEP (He stepped out) -0.23 0.76 -0.28 0.84 0. 0. 857
STILL (He is still there) -0.82 -1.02 -1.04 1.00 0. 0. 858
STILL (There is still more) -1.03 -0.73 -0.200 1.28 0. 0. 859
STILL (All was still) 1.24 -2.96 -0.05 3.21 0. 0. 860
STONE (It is a stone) -0.50 -2.87 2.68 3.96 0. 0. 861
STOP (It stopped) -1.19 -0.80 1.27 1.92 0. 0. 862
STORY (That is the story) 0.83 -0.12 -1.18 1.45 0.85 0. 863
STRANGE (It is strange) -1.46 -1.00 0.235 1.78 0. 0. 864
STREAM (There is а stream of them) 0.04 E —1.0 1.88 0. 0. 865 
STREET (This is the street) —0.31 0. 0.73 1.00 0. 0. 866 
STRONG (It is strong) —0.42 e 2.21 2.50 0. 0.99 867 
STUDENT (He is a student) 0.48 0. —0.20 0.58 0. 0. 868 
STUDY (He studied) 0.00 -0. 118 138" "Ө; 0. 869
SUBJECT (That is the subject) —0.17 —0. 0.87 1.4 0. 0. 870 
SUCCEED (He succeeded) 1.12 0. 0.40 1.24 0. 0. 871 
SUCCESS (He succeeded) 1.56 1. 0.76 2.27 0. 1.47 872
SUCCESSFUL (He was successful) 1.75 A. 0.00 2.63 0. 1.70 873
SUCH (He had such a time) 0.10 0. -0.25 0.30 0. 0. 874
SUFFER (He suffered) —2.23  —0. —1.93 3.10 0. 0. 875 
SUGGEST (He suggested it) 0.00 =0. -0.20 0.23 0. 0. 876
SUMMER (It is summer) 1.22 1. -1.59 2.37 1.00 0. 877
SUN (There is the sun) 0.64 0. 0.52 0.83 0. 0: 878 
SUPPLY (He has а supply of them) 0.21 —0. 1.10 1.49 0. 0. 879 
SUPPLY (He supplied them) 0.35 0. 0.80 1.07 0. 0.54 880 
SUPPORT (He supported them) 1.32 1 —0.11 2.03 0. 0.82 881 
SUPPOSE (He supposed it to be so) -0.88 -0 -0. 0.07.0; 882
SUPPOSE (He was supposed to do it) —0.48 —1 —0. 1:22. 10: 883 
SURE (He is sure) 0.64 0 0. 1.97 Ө: 65 884 
SURE (It is а sure thing) 0.38  —0 0. 0.70 0. e 
SURFACE (That is the surface of it) 0.00  —0 1 1270) 0; v 
SURPRISE (He surprised them) 0.52 M -1. 2.05 0.77 4 
SURROUND (They surrounded it) 2s б 0. EA i 889

SWEET i Я - = . i 
SWEET (They were sweet to him) 0.38 0 0. 0.58 0. 890

2.31 0 0. 2.93 0. 

0.45 0. 0 1.15 0. 

0.24 0 0.93 0. 

1. 1.59 0. 

0. 1.16) 40: 

0. 1,220) 

2.47 0. 

0.83 0. 

1.15 (0y 

0.45 0. 

8.29 «0. 

0.0 0. 

0.40 0. 

1.01 0. 

1.60 0. 

0.92 0. 

0.82 0. 

0.75 0. 

0.97 0. 

0.98 0. 
0.

0. 

0. 

0. 

0. 

0. 

0. 

0. 

0. 

0. 

0. 

ы 0 

TAKE (He took hold of it) 4 à 
TAKE (He took interest in them) -0.01 0. t a aos
TAKE (He took it down) USD c ж 0. 896
TALK (He talked) 057 WEM = 0. 897 
TAX (It is a tax) TO, >> : 0. 898
TEACH (He taught it) 0.80 ч 2 0. 899
TEACHER (He is a teacher) 0.28 9 a 0. 900
TELL (He told them about it) EOD Weed ZA 0. 901 
TERRIBLE (That is terrible) —3.26 9. E 0. 902 
THEIR (It is theirs) 024 0: -0 0. 903 
THEN (Then he was done) ip vs 0 0. 904 
THEN (If this is so then so is that) 0:98 ЧО E 0. 905 
THING (The thing is there) sepe) e 0. 0. 90% 
THINK (He thought about things) CIS 70, 0. 907 
THINK (He thought it so) 0.42 ш 0. 0. 908
THINK (He thought it through) 0.07 Es 25 0. 909 
THINKING (He is thinking) e _0 —0.62 0. 910 


THOUGHT (It is a thought) 
Eval Actv Potn Polr n Aff n Ach ID
THOUGHT (He is deep in thought) 0 рю Io 1.32 0. 0. 912
THROW (He threw it) awe -1.07 0.25 1.20 0. 0. 913
THUS (It was thus) т aes 0.24 0.73 0.80 0. 0. 914 
TIME (This is the time to do it) . 0.18 -0.62 0.64 0. 0. 915
TIME (It was in those times) iiri £e Е On 074 0. 0. 916 
TIME (It will be so in time) E 0.19 0.24 0.30 0. 0. 917
TIME (He has time for it) ка E E 25 md 6 = 
TOGETHER (They are together) 1 ac E. 1.47. 0. 0. 919 
i —1.40 —0.09 0.44 1 

TOO (That is too often) $8 0.12 0.42 0.58 0. 0. 920
TOTAL (That is the total number) a Mem acm 104 0.00 n a 
TOWN (He is in town) EA EE 1.35 1.80 0. 120 92
TRAIN (He trained them) vem 0.05 237 2 58 0. 0. 923 
TREE (It is a tree) 2.25 0.86 10.0% 9.88 A 0. 94 
TROUBLE (He has troubles) E 011 —0.54 2.48 0. 0. 925 
TROUBLE (It troubled him) ау cp 0:25 0.79 "y б. 090 
тй M iron 30 137 060 223 10 0 т 
TRUE (He is a true person) 1.80
TRUTH (It is the truth) 0.29
TRY (He tried to do it) 0.03 Й x^ e M a о, 
TURN (He turned around) 0.13 0.09 ^ eed А ^t m 
TURN (He turned it around) —0.18 —0.04 Ex "kn E б. “© 
TURN (It turned into that) —0.91 zd Es 135 a з к= 
TYPE (It is a type) -0.16
UNCLE (It is his uncle) 1.36 0. p ra ^ 0 ке 
UNDERSTAND (He understood) 0.95 —0.97 ea +8 b ^3 s 
x чыгын cepa по L4 039 123 0. оз 0j 
UNIVERSITY (He is at a university) 0.99 1.41 on 1 E я С E 
UP (It is up there) -0.82 -0.77
UP (Time is up) -0.49 -0.64
UP (He spoke up) 0.17 0.59
USE (He used that one) -0.27 0.50 -0.
USE (He used it up) -1.17 -0.57 -0.55
USE (He used to do it)
USE (It is in use) —0. 1 4 3 b А 
USUAL (That is the usual thing) -0.49 -1.18 -0.02 1.28 0.
USUALLY (He usually does) —0.22 —0.93 —0.41 1.04 3 Y oi 
VALUE (It has value) 0.77 —0.37 0.34 0.92 z 4 od 
VARIOUS (There are various things) 0.42 —0.2b  —0.36 0.61. 0. " "m 
VERY (That is very often) -0.09 0.39 0.04 0.40 0. y 950
VICTORY (It is a victory) 1.06 1.35 1.30 2.15 0. А 951
VIEW (That is his view) 0.22 -0.25 -0.18 0.38 0. с 952
VISIT (He visited them) 0.71 -0.14 -1.18 1.35 0.73 i 953
VOICE (It is a voice) 1.08 -0.02 -1.77 2.08 1.45 "1 954
VOTE (They voted on it) 0.86 -0.01 2502. 8030100. "a 955
VOTE (They took a vote) 0.64 0.39 1.35 1.55 0. 0. 956
WAIT (He waited) -0.39 -1.29 0.56 1.46 0. m 957
WALK (He walked) 0.45 1.24 0.13 1.32 0. А 958
WALL (It is on the wall) 0.24 -1.99 0.48 2.05 0. 0. 959
WANT (He wanted that) -0.43 -0.00 -0.19 0.47 0. 0. 960
WAR (It is war) -3.96 2.22 -0.04 4.54 0. 0. 961
WARN (He warned them) -0.64 1.88 0.07 1.99 0. 0. 962
WATCH (He watched) -0.07 -0.90 -0.75 1.17 0. 0. 963
WATER (There is water) 0.66 1.18 -0.30 1.39 0. 0. 964
WAY (That is the way to do it) 0.02 0.44 0.84 0.95 0. 0. 965
WAY (It is along the way) -0.61 -0.15 0.15 0.65 0. 0. 966
WEALTH (He has wealth) 0.35 1.08 -0.24 1.18 0. 0. 967
WEAR (He wore that) -0.01 0.51 -0.44 0.68 0. 0. 968
WEEK (This is the week for it) —0.24 0.78 0515 — 083.1 «0. 0. 969 
WELL (He is well along) 0.57 0.68 -0.21 0.91 0. 0. 970 
WELL (He is doing well) 1.18 0.00 003 1.18 0. 0. E 
WHEN (He will when they are done) —0.84 —0.61 0.12 1.4 0. 0. om 
WHITE (It is white) 1.139: GELS 041 51477, 0; 0. 078 
WHOLE (That is the whole thing) 0.66 =] 33 0.53 1.58 0. 0. 
Eval Actv Potn Polr n Aff n Ach ID
WIDE (It is wide) -0.54 -1.21 0.30 1.36 0. 0. 974
WIFE (It is his wife) 0.97 -0.32 -3.12 3.28 1.85 0. 975
WILD (It was wild talk) -1.94 1.42 5607. $41 6: 0. 976
WIN (He won it) 1.28 2.20 0.43 2.58 0. 1.50 977
WIND (It is the wind) -0.83 0.69 0.57 1.22 0. 0. 978
WINDOW (There is the window) 0.65 -2.69 0.77 2.8 0. 0. 979
WINTER (It is winter) -2.21 1.06 1.36 2.80 0. 0. 980
WISE (He is wise) 1.49 0.88 1.2% 2.4 0. 1.32 981
WISH (He wished for it) 1.14 -0.75 -1.25 1.85 0.81 0. 982
WOMAN (That is the woman) 1.65 0.73 -2.69 3.24 2.65 0. 983
WONDER (He wondered about it) -0.13 -0.82 -0.94 1.200 0. 0. 984
WONDERFUL (It is wonderful) 1.42 1.00 -1.73 2.45 1.86 0. 985
WORD (He had words for it) -0.06 -0.33 -0.56 0.65 0. 0. 986
WORK (What is his work) 0.53 1.19 1.38 1.90 0. 1.34 987
WORK (That is work) -0.50 1.07 0.71 1.388 0. 0. 988
WORK (He worked for them) 0.06 0.97 0.75 1.28 0. 0.62 989
WORK (He worked at it) 0.14 1.06 1.78 2.08 0. 1.20 990
WORLD (He went out into the world) -1.03 0.94 0.05 1.399 0. 0. 991
WORRY (He worried) -2.20 0.25 -1.64 2.75 0. 0. 992
WORTH (It is worth it) 0.65 0.31 0.42 0.78 0. 0. 993
WRITE (He wrote it) 0.51 0.21 0.25 0.61 0. 0. 994
YEAR (It was years ago) 0.24 -0.20 -0.18 0.40 0. 0. 995
YES (Yes, it is) 0.38 0.86 -0.35 1.01 0. 0. 996
YET (He has yet to do it) -1.45 -1.6 -0.51 1.93 0. 0. 997
YOUNG (He is young) 0.98 1.02 -2.60 2.96 2.07 0. 998
YOUNGER (He is younger) 0.18 0.74 -1.82 1.97 0.97 0. 999
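The Polr column is not defined on these pages, but the legible rows are consistent with reading it as the distance of a word's profile from the neutral origin, computed from the first three scores in the row. A minimal Python sketch under that assumption follows; the function names are hypothetical, and the three sample rows are copied from the WIFE, WIN, and WORK entries above.

```python
import math


def polarization(evaluation, activity, potency):
    """Euclidean distance of a three-score profile from the origin.

    Assumed definition of the Polr column; the source pages do not state it.
    """
    return math.sqrt(evaluation ** 2 + activity ** 2 + potency ** 2)


# (first score, second score, third score, tabled Polr), copied from legible rows.
SAMPLE_ROWS = {
    "WIFE (It is his wife)": (0.97, -0.32, -3.12, 3.28),
    "WIN (He won it)": (1.28, 2.20, 0.43, 2.58),
    "WORK (He worked at it)": (0.14, 1.06, 1.78, 2.08),
}


def max_discrepancy(rows):
    """Largest absolute gap between the computed and the tabled Polr values."""
    return max(
        abs(polarization(e, a, p) - polr) for e, a, p, polr in rows.values()
    )


if __name__ == "__main__":
    for label, (e, a, p, polr) in SAMPLE_ROWS.items():
        print(f"{label}: computed {polarization(e, a, p):.2f}, tabled {polr:.2f}")
```

Because the tabled scores are rounded to two decimals, agreement to within about 0.01 is the most that can be expected; a row that disagrees by much more is a candidate misprint or scanning error, not evidence against the assumed definition.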




Vol. 79, No. 9


Whole No. 602, 1965 
Psychological Monographs: General and Applied 


DIAGNOSTICIANS VS. DIAGNOSTIC SIGNS:
THE DIAGNOSIS OF PSYCHOSIS VS. NEUROSIS FROM THE MMPI¹


LEWIS R. GOLDBERG 
University of Oregon and Oregon Research Institute 


For each of 861 MMPI profiles originally secured by Paul Meehl from
hospitals and clinics throughout the country, 29 clinical judges made
diagnostic judgments along a neurotic-psychotic continuum. The accuracy of
these clinical judgments was compared with that achieved by
MMPI scale scores, 8 scale ranks, 54 diagnostic signs and rules, 35 profile
components, 19 linear regression analyses, 2 
and numerous actuarial tables. The accuracy of the judges and the actuarial 
indexes was analyzed (a) for each of the 7 samples, (b) for the set of samples 
for which criterion contamination was least likely, and (c) for the 
sample of 861 cases. A number of relatively simple actuarial indexes turned 
out to be more accurate than the best diagnostician. More importantly, the 
present study reversed some conclusions from previous analyses of the same 
data and showed that simple linear combinations of scale scores were more 


accurate than configural models, including the Meehl-Dahlstrom Rules. 


Since the publication of Paul Meehl's influential book, Clinical vs.
Statistical Prediction in Psychology (Meehl, 1954), the contribution of
the clinician to the diagnostic process has begun to be scrutinized
closely; for a recent literature review, see Gough (1962). While the bulk
of the empirical evidence has appeared to indicate the relative advantages
of statistical predictions, Meehl (1959) has suggested six situational
factors which could lead to the superiority of clinical judgments over
actuarial ones. One of these hypothetical factors, the apparent ability of
the clinician to combine information configurally, has as yet been
subjected to preliminary empirical test (Lykken & Rose, 1963; Meehl, 1959;
Meehl & Dahlstrom, 1960). One major aim of the present study is to explore
further the accuracy of clinical judgments as compared to diverse
actuarial ones for this highly configural diagnostic problem.

¹ This study stems from a larger research project at Oregon Research
Institute directed by Paul J. Hoffman and supported by Grant MH-04439 from
the National Institute of Mental Health, United States Public Health
Service. The continuous assistance of Richard R. Jones is gratefully
acknowledged. The author wishes to thank Paul E. Meehl, who brought
together the data analyzed in this study, for permitting its reanalysis at
Oregon Research Institute. The helpful comments of Paul peere, Jerry
Wiggins, Leonard G. Rorer, and i Peabody to an earlier draft of this paper
are greatly appreciated.


In addition, the present study was designed to provide base-line data for
one major diagnostic problem against which new predictive indexes later
can be compared. As efforts to construct more configural actuarial methods
continue (e.g., Hoffman, 1960), the incremental validity of such
statistical indexes can be evaluated against those analyzed in the present
study. Perhaps more importantly, the present study provides a base-line
index of clinical diagnostic accuracy against which to gauge improvement
resulting from the specific training of clinical judges.²

A final aim of the present monograph is to answer two questions of a more
applied sort: given the present state of psychiatric diagnosis and
personality assessment, (a) how accurately can the diagnosis of neurosis
vs. psychosis be determined from an MMPI (Minnesota Multiphasic
Personality Inventory) profile and (b) what MMPI indexes are most valid
for making this diagnosis?

² A study in progress at Oregon Research Institute, conducted by Leonard
G. Rorer and the author, aims to discover the maximum cross-validity for
this diagnostic problem of skilled clinicians, psychology graduate
students, and naive judges when subjects in each group have received
months of daily intensive training with immediate feedback.


PROCEDURE 


Nature of the Diagnosis 


The criterion utilized in this study was the ascription of a psychotic, as
opposed to a neurotic, diagnosis to a psychiatric patient. The use of such
a diagnostic problem has been defended on the grounds that: “...the
differences between psychotic and neurotic profiles are considered in MMPI
lore to be highly configural in character, so that an atomistic treatment
by combining single scales linearly should theoretically be a very poor
substitute for a configural approach [Meehl, 1959, p. 104].” While Meehl saw
the major goal of his 1959 study as methodological rather than substantive,
the uncovering of valid predictive procedures for the differential
diagnosis of broad classes of psychiatric patients could ultimately turn
out to have important theoretical implications.


Patient Samples 


All data utilized in this study were originally 
collected by Meehl, who described his procedure 
as follows: 

Samples of MMPI profiles produced by adult 
male psychiatric patients were obtained from 
seven clinical sources around the nation, the 
diagnoses being restricted to psychosis or neu- 
rosis. Criterion contamination (such as knowl- 
edge of MMPI scores by diagnostician) varied 
from none at all (three samples) to unspecifiably 
high. Sample sizes varied from 42 to 200 with a 
median of 103, a total of 861 profiles in all. The 
percentage of psychotics in the samples varied 
from 37% to 64% with a median of 51%, the 
total psychotic incidence over the entire sample 
being 47%, very close to an even split on the 
criterion side [Meehl, 1959, p. 104]. 

In those instances (e.g., Sample B) in which a definite sample was known to
exist on the basis of previous published reports by the original
investigators (Hunt et al., 1948 [Hunt, Carp, Cass, Winder, & Kantor, 1948])
constant exclusion criteria were applied to all samples to render the MMPI
records and the clinical diagnoses suitable for the present differential
purpose and to bring MMPI application into line with current practice. Cases
of known or strongly suspected mental deficiency, organic brain damage,
acute physical illness (e.g., fever delirium), or psychopathic personality
were eliminated from consideration. Multiphasic profiles were rejected as
probably invalid if the validity scales showed a “?” > 60, L > 70, or
F ≥ 80.... Whenever the original answer sheets were available to us, all
scales were rescored, and the T-transformation and profile plotting checked
for accuracy [Meehl & Dahlstrom, 1960].

The seven samples combined, totaling 861 psychiatric patients, comprise one
of the largest samples utilized in MMPI validational research. A
description of each of the seven samples will be presented later, as the
results for each sample are discussed in turn.


Diagnosticians 


Thirteen PhD clinical psychologists, relatively experienced in MMPI
interpretation, made diagnostic judgments from each of the MMPI profiles.
While some of these judges were on the faculty of the University of
Minnesota, the majority were staff psychologists at hospitals and clinics
in the Minneapolis area; for convenience, all 13 PhD judges will henceforth
be called staff judges. In addition, 16 clinical psychology trainees
(predoctoral students at the University of Minnesota) also served as
diagnosticians. All 29 clinical judges were secured by Meehl; their
instructions have been described as follows:

They were given the seven sets of profiles one at a time and instructed to
sort the individual profiles within a sample on an 11-step forced normal
distribution from least to most psychotic. (They were told that the
continuum could be thought of either as a complex psychological dimension,
or as a degree of probability of belonging to a category, depending upon
the individual judge's views of nosology.) The only information given to
the sorters was that the patients were males under psychiatric care, all
having received a neurotic or psychotic diagnosis. The sorters did not know
which samples were inpatient or outpatient, VA or non-VA, nor did they know
the actual incidence of psychosis in any sample nor over all the cases.
After sorting, the judge was required to draw a cutting line indicating the
point on his distribution of cases at which he thought the psychotic
diagnosis began to preponderate. This made it possible to treat each
judge's sorting dichotomously as well as continuously, and also permitted
the establishment of a ‘doubtful’ region for comparison with those
statistical methods which included a doubtful category [Meehl, 1959,
p. 104].


³ The sample descriptions have been deposited with the American
Documentation Institute. Order Document No. 6330 from ADI Auxiliary
Publications Project, Photoduplication Service, Library of Congress,
Washington, D.C. 20540. Remit in advance $1.25 for microfilm or $1.25 for
photocopies and make checks payable to: Chief, Photoduplication Service,
Library of Congress.
Diagnostic Signs


“Sign” is here used to refer to any scale score or combination of scores,
however simple or complex, which can be specified precisely (e.g.,
nonjudgmentally). An operational definition of a sign, in this sense, is
any index which can be programed for a computer. MMPI signs were obtained
for the present study from the literature on the MMPI and from personal
communications with MMPI experts. In addition, signs were developed
empirically by the author, utilizing the sample previously used by Meehl
and Dahlstrom (1960) in deriving the Meehl-Dahlstrom Rules.

Four classes of signs can be differentiated:

1. Single-scale scores: While the scores on single MMPI scales are not
considered signs in the usual sense, their inclusion in this study allows a
comparison of more complex signs with these basic predictive indexes.

2. Signs made up of some linear combination of single scales: The average
elevation of the three scales comprising the Neurotic Triad, the arithmetic
difference between pairs of scale scores (e.g., Pt − Sc), and the Anxiety
Index (Welsh, 1952) are examples of signs utilizing a linear combination of
single scale scores.

3. Configural combinations of a few scales: Examples of signs of this type
would include the number of clinical scales with T scores ≥ 70, the
Internalization Ratio (Welsh, 1952), and High Point Rules.

4. Complex, and usually highly configural, rules or formulae: Where a
sequential strategy for arriving at a decision is utilized (e.g., Meehl &
Dahlstrom, 1960), or where a tally of a number of signs is used as an index
(as in the Peterson Signs, Peterson, 1954a, or the Taulbee-Sisson Signs,
Taulbee & Sisson, 1957), or where patterns of scores are coded into
“Profile Types” (e.g., Marks & Seeman, 1963), more complicated, and often
more time-consuming, diagnostic indexes can be computed.

Examples of all four classes of MMPI signs were compared in this study.
Where simple signs had been used in previous MMPI studies (e.g., Modlin,
1947), these signs were included in the present analysis. Where more
complex sets of signs had been developed (e.g., Peterson, 1954a; Taulbee &
Sisson, 1957), both the complex index and each of the single components of
the index were included.

Table 1 lists the initial 65 signs analyzed in this study. Since the MMPI
profiles collected by Meehl included only the three validating MMPI scales
(L, F, and K) and the eight clinical scales (Hs, D, Hy, Pd, Pa, Pt, Sc, and
Ma), signs which utilized Mf or Si could not be analyzed. The complicated
Meehl-Dahlstrom Rules and Marks-Seeman Profile Types were programed for an
IBM 7094 computer; other signs were programed for an IBM 1620 computer,
using the computing formulae listed in Table 1. The original data, from
which all signs were computed, consisted of MMPI scale T scores,
K-corrected where appropriate. The values for each sign were grouped into
10 intervals, so that each sign could be punched into a single column of an
IBM card. This grouping procedure did not alter the shape of the original
distribution of scores.

The first 11 signs in Table 1 are the 11 MMPI scales, the basic components
of all subsequent signs, each partitioned into 10 intervals as described
above. Signs 12-20 each consist of a tally of the number of clinical scales
falling beyond a particular T score. Sign 21 indicates the average
elevation of the total profile, while Signs 22-24, respectively, indicate
the average elevation of the neurotic, psychotic, and character disorder
sets of clinical scales. Signs 25-27 express the ratio of each of the three
components to the average elevation of the total profile; Signs 28-30
indicate the ratios of the components to each other. Signs 31-33 express in
turn the arithmetic difference between each of the three scales (Pa, Sc,
and Ma) and the average elevation of the Neurotic Triad.
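
As a rough illustration of the simpler signs just described, the following Python sketch computes a tally of elevated clinical scales, an average elevation, and a scale-minus-Neurotic-Triad difference from a dictionary of T scores. The function names and the example profile are hypothetical, the cutoff is treated as inclusive by assumption, and the final grouping into 10 punched-card intervals is omitted.

```python
# Hypothetical helper names; T scores are assumed to be numbers keyed by scale.
CLINICAL_SCALES = ("Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc", "Ma")
NEUROTIC_TRIAD = ("Hs", "D", "Hy")


def count_scales_at_or_above(profile, cutoff):
    """Tally of clinical scales at or above a T-score cutoff (cf. Signs 12-17)."""
    return sum(1 for scale in CLINICAL_SCALES if profile[scale] >= cutoff)


def mean_elevation(profile, scales=CLINICAL_SCALES):
    """Average T score over a set of scales (cf. Signs 21-24)."""
    return sum(profile[scale] for scale in scales) / len(scales)


def scale_minus_neurotic_triad(profile, scale):
    """Difference between one scale and the Neurotic Triad mean (cf. Signs 31-33)."""
    return profile[scale] - mean_elevation(profile, NEUROTIC_TRIAD)


# Invented profile, for illustration only.
example = {"Hs": 60, "D": 70, "Hy": 65, "Pd": 55,
           "Pa": 62, "Pt": 68, "Sc": 75, "Ma": 50}
```

With this invented profile, two scales reach a T score of 70, and Sc exceeds the Neurotic Triad mean by 10 points.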

Sign 34 expresses the degree to which the pattern of the three scales of
the Neurotic Triad varies from the most extreme “conversion V” to the most
extreme “depressive spike.” Signs 35-47 are those 13 (of the 16)
Taulbee-Sisson Signs which do not involve Mf. Sign 48 is the number of
these 13 Taulbee-Sisson Signs present in the profile. Each of the Peterson
Signs is included among the 65 signs; Sign 49 indicates the number of
Peterson Signs present in the profile. Welsh's Anxiety Index and
Internalization Ratio are expressed as Signs 50 and 51, followed by the two
major formulae, beta and delta, which must be computed for the
Meehl-Dahlstrom Rules. Four signs which Harrison Gough indicated help him
in this diagnostic task were included, as well as three signs contributed
by Victor Lovell.

Sign 61, the simplest sign utilized in this study, 
classified a profile as neurotic or psychotic solely 
on the basis of its highest scale score. If a pa- 
tient's highest score was on a scale originally de- 
rived from neurotic subjects (Hs, D, Hy, or Pt) the 
profile was classified neurotic; if the high point 
fell on any other scale, the profile was classified 
psychotic. Sign 62 is a similar classification system 
based upon the profile’s lowest score; its coding 
is the reverse of Sign 61. 
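
The High Point and Low Point Rules are simple enough to state as code. The sketch below is one possible reading of the description above, not the original program: the function names are hypothetical, and ties between scales are broken by the order in which the scales are listed, which is an assumption the text does not address.

```python
# Scales originally derived from neurotic subjects, as listed in the text.
NEUROTIC_SCALES = {"Hs", "D", "Hy", "Pt"}
CLINICAL_SCALES = ("Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc", "Ma")


def high_point_rule(profile):
    """Sign 61: classify by the scale with the highest T score."""
    highest = max(CLINICAL_SCALES, key=lambda scale: profile[scale])
    return "neurotic" if highest in NEUROTIC_SCALES else "psychotic"


def low_point_rule(profile):
    """Sign 62: classify by the lowest scale, with the coding reversed."""
    lowest = min(CLINICAL_SCALES, key=lambda scale: profile[scale])
    return "psychotic" if lowest in NEUROTIC_SCALES else "neurotic"


# Invented profile: highest on Sc, lowest on Hs.
example = {"Hs": 48, "D": 60, "Hy": 55, "Pd": 58,
           "Pa": 66, "Pt": 70, "Sc": 82, "Ma": 63}
```

For the invented profile, both rules return "psychotic": the high point falls on Sc, and the low point falls on a neurotic scale (Hs).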

Sign 63, the Two Point Rules, was derived by the author by classifying all
profiles according to the first two digits of their Welsh (1948) Code and
calling the profile either psychotic or neurotic depending upon the
relative frequency of each diagnosis (for each Two Point Code) in the
derivation sample. It is important to remember that all of the signs which
are unique to this study (Signs 61-63) were derived (and cutting scores
obtained) on the original derivation sample from which Meehl and Dahlstrom
(1960) derived their sequential rules. Validity coefficients and hit-rates
for all signs were cross-validated on different samples than those from
which the signs were originally derived.


TABLE 1
THE INITIAL 65 DIAGNOSTIC SIGNS

No.   Sign                                      Computing formula

1     L                                         (L/5) − 7
2*    F                                         (F/6) − 7
3     K                                         (K/6) − 5
4*    Hs                                        (Hs/10) − 2
5*    D                                         (D/10) − 2
6*    Hy                                        (Hy/10) − 2
7     Pd                                        (Pd/10) − 2
8*    Pa                                        (Pa/10) − 2
9     Pt                                        (Pt/10) − 2
10*   Sc                                        (Sc/10) − 2
11*   Ma                                        (Ma/10) − 2
12    No. of clinical scales > 55
13    No. of clinical scales ≥ 60
14    No. of clinical scales > 65
15    No. of clinical scales > 70
16    No. of clinical scales ≥ 75
17    No. of clinical scales > 80
18    No. of clinical scales < 50
19    No. of clinical scales < 45
20    No. of clinical scales < 40
21    8 clinical scales: Mean (X)               [(Hs + D + Hy + Pd + Pa + Pt + Sc + Ma)/80] − 2
22*   Neurotic Triad: Mean (N)                  [(Hs + D + Hy)/30] − 2
23*   Psychotic Triad: Mean (P)                 [(Pa + Pt + Sc)/30] − 2
24*   Ma + Pd: Mean (C)                         [(Ma + Pd)/20] − 2
25*   Neurotic Slope (N/X)                      (Sign 22/Sign 21) · 5
26*   Psychotic Slope (P/X)                     (Sign 23/Sign 21) · 5
27*   Ma + Pd Slope (C/X)                       (Sign 24/Sign 21) · 5
28*   Neurotic Triad/Psychotic Triad (N/P)      (Sign 22/Sign 23) · 5
29*   Neurotic Triad/(Ma + Pd) (N/C)            (Sign 22/Sign 24) · 5
30    Psychotic Triad/(Ma + Pd) (P/C)           (Sign 23/Sign 24) · 5
31*   Pa − Neurotic Triad (Pa − N)              ([Pa − 1/3(Hs + D + Hy)]/10) + 5
32*   Sc − Neurotic Triad (Sc − N)              ([Sc − 1/3(Hs + D + Hy)]/10) + 5
33*   Ma − Neurotic Triad (Ma − N)              ([Ma − 1/3(Hs + D + Hy)]/10) + 5
34    Spiking of Neurotic Triad                 V[D − 1/2(Hy + Hs)](D − Hs)(D − Hy)/10 + 3
35*   Hs − Hy: Taulbee-Sisson Sign 1ᵉ           [(Hs − Hy)/10] + 5
36*   Hs − Pd: Taulbee-Sisson Sign 2ᵉ           [(Hs − Pd)/10] + 5
37*   Hs − Pa: Taulbee-Sisson Sign 3ᵉ           [(Hs − Pa)/10] + 5
38*   Hs − Pt: Taulbee-Sisson Sign 5ᵉ           [(Hs − Pt)/10] + 5
39*   Hs − Sc: Taulbee-Sisson Sign 6ᵉ           [(Hs − Sc)/10] + 5
40*   Hs − Ma: Taulbee-Sisson Sign 7ᵉ           [(Hs − Ma)/10] + 5
41*   D − Pd: Taulbee-Sisson Sign 8ᵉ            [(D − Pd)/10] + 5
42*   D − Pa: Taulbee-Sisson Sign 9ᵉ            [(D − Pa)/10] + 5
43*   Hy − Pd: Taulbee-Sisson Sign 10ᵉ          [(Hy − Pd)/10] + 5
44*   Hy − Pa: Taulbee-Sisson Sign 12ᵉ          [(Hy − Pa)/10] + 5
45*   Hy − Ma: Taulbee-Sisson Sign 13ᵉ          [(Hy − Ma)/10] + 5
46*   Pt − Pa: Taulbee-Sisson Sign 15ᵉ          [(Pt − Pa)/10] + 5
47*   Pt − Sc: Taulbee-Sisson Sign 16ᵉ          [(Pt − Sc)/10] + 5
48*   No. of Taulbee-Sisson Signsᵉ              (No. of 13 possible signs)/1.5
49    No. of Peterson Signsᵃ                    No. of 6 possible signs
50    Anxiety Indexᶠ                            (4D + 3Pt − 2Hs − 2Hy + 90)/60
51*   Internalization Ratioᶠ                    ([(Hs + D + Pt)/(Hy + Pd + Ma)] · 10) − 5
52*   Betaᵍ                                     (Pt + Sc − Hs − D + 100)/20
53*   Deltaᵍ                                    (Pd + Pa − Hs − Hy + 100)/20
54    Psychotic Triad Scatter                   [Pa² + Pt² + Sc² − 1/3(Pa + Pt + Sc)²]/50
55    Gough Pd Signʰ                            ([Pd − 1/2(Pa + Sc)]/10) + 5
56    Gough Anxiety Signʰ                       [(L + Hs + Pa − D − P)/10] + 1
57*   Gough Hs Signʰ                            (Hs + Ma − Pd)/10
58*   Neurotic Triad − Psychotic Triadⁱ (N − P) (Sign 22 − Sign 23) + 5
59    Pa − Pdⁱ                                  [(Pa − Pd)/10] + 5
60    Hs − Dⁱ                                   [(Hs − D)/10] + 5
61*   High Point Rules                          Hi Code 1, 2, 3, 7 = N; 4, 6, 8, 9 = P
62*   Low Point Rules                           Lo Code 4, 6, 8, 9 = N; 1, 2, 3, 7 = P
63*   Two Point Rules                           (see text)
64*   Meehl-Dahlstrom Rulesᵍ                    (see text)
65    Marks-Seeman Profile Typesʲ               (see text)

ᵃ Peterson (1954b).
ᵇ Oskamp (1962).
ᶜ Modlin (1947).
ᵈ Ruesch & Bowman (1945).
ᵉ Taulbee & Sisson (1957).
ᶠ Welsh (1952).
ᵍ Meehl & Dahlstrom (1960).
ʰ Gough (personal communication).
ⁱ Lovell (personal communication).
ʲ Marks & Seeman (1963).
* Sign valid (p < .01) in Meehl-Dahlstrom derivation sample.

Finally, the Meehl-Dahlstrom Rules themselves were included in this study,
as well as the recent Marks-Seeman Profile Types (Marks & Seeman, 1963).
For the latter system, profiles were coded into one of the 16 types (plus a
category of “unclassifiable”) and called neurotic or psychotic on the basis
of the relative frequency of each diagnosis for each Profile Type in the
derivation sample. While the 65 signs utilized in this study obviously do
not represent all possible combinations of MMPI scores, they are meant to
be reasonably exhaustive of the signs reported in the literature. In
addition, they appear to be sufficiently heterogeneous in form to allow
some estimate of the likely predictive accuracy for this diagnostic problem
of each of the major modes of combining MMPI scores.

Because many of the 65 signs were originally proposed for a diagnostic
problem different from that under study, all 65 signs were first validated
in the Meehl-Dahlstrom derivation sample (N = [illegible]). Forty-three
signs achieved significant (p < .01) validity coefficients in the
derivation sample; these signs are noted by an asterisk in Table 1.


Analyses 


The validity of each of the 65 signs and each of the 29 clinical judges was
ascertained for each of seven cross-validation samples, as well as for the
total sample of 861 cases. Since the group of clinical judges included both
staff clinicians and trainees, it was possible to compare the validity of
the signs with that of the average staff clinician as well as with that of
the average trainee.⁴ In addition, the accuracy of the composite judgments
of both staff and trainee judges was compared with the signs. Finally, the
validity of the signs, as well as that of the judges, was compared with
other indexes, including linear regression equations, older configural
methods (e.g., Lykken & Rose, 1963), and some new configural techniques.

⁴ The average validity coefficient was computed by transforming the
original correlations to Z scores, calculating the average Z, and then
converting this average back to an r. In all cases, this average r was
equal to the simple arithmetic average of the r's (for the 2-digit figures
reported).
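
The averaging procedure described in the footnote on average validity coefficients (transform each r to Z, average, transform back) can be written in a few lines. This is a minimal sketch assuming the transformation meant is Fisher's r-to-Z, which is the inverse hyperbolic tangent; the function name and the example values are illustrative, not data from the study.

```python
import math


def average_validity(correlations):
    """Average correlations via Fisher's Z: r -> atanh(r), mean, tanh -> r."""
    z_scores = [math.atanh(r) for r in correlations]
    mean_z = sum(z_scores) / len(z_scores)
    return math.tanh(mean_z)


# Invented coefficients, for illustration only.
example_rs = [0.20, 0.30, 0.40]
example_average = average_validity(example_rs)
```

For moderate correlations such as these, the Z-averaged value agrees with the simple arithmetic mean to two decimals, which is consistent with the footnote's remark.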


RESULTS 

Table 2 presents the validity coefficients for the diagnosticians, the 11
single scales, and the 13 signs whose criterion correlations were highest
for the total sample of 861 patients. Validity coefficients⁵ are included
for each of the seven samples as well as for the total patient group. The
number of patients and the percentage of psychotics in each sample are
listed at the top of Table 2. The ranges of the validity coefficients for
the 13 staff judges, the 16 trainee judges, and all 29 diagnosticians are
listed, as well as the validity coefficients for the average staff judge,
the average trainee judge, and the average of all 29 judges. In addition,
the diagnosticians' judgments were summed to produce a composite prediction
for the staff group, the trainee group, and for all 29 judges; the validity
coefficients of these composite predictions are also presented in Table 2.
Finally, the percentage of the 29 judges (and the percentage of the 43
“valid” signs) with validity coefficients significantly greater than zero
(p < .01) are listed.

⁵ All validity coefficients were calculated by computer, utilizing the raw
score formula for the product-moment correlation. Since the great majority
of the correlations involved a continuous predictor variable and a
dichotomous criterion, most of the coefficients are point biserial
correlations. For indexes such as the Meehl-Dahlstrom Rules, which classify
patients trichotomously, neurotic diagnoses were coded “1,” indeterminate
diagnoses were coded “2,” and psychotic diagnoses were coded “3.” This
procedure permits higher validity coefficients than those achieved by a
dichotomous classification, with indeterminates being randomly assigned to
the two determinate categories. For indexes such as the High Point and Low
Point Rules, which classify patients dichotomously, the resulting validity
coefficients are phi's.

It is important to realize that since all of the judges' predictions were
made on a forced-normal continuous rating scale, the judges' validity
coefficients (point biserials) do not involve the cutting point between
psychotic and neurotic patients and therefore are not affected by any
interjudge differences in assumed population base rates. The accuracy
percentages, on the other hand, take such differences into account. It is
for this reason that both indexes are reported.
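
The footnote's point that one raw-score product-moment formula yields a point biserial or a phi, depending only on how the variables are coded, can be illustrated as follows. This is a sketch with invented data; the function name is hypothetical and no claim is made about the original computer program.

```python
import math


def product_moment_r(x, y):
    """Raw-score formula for the Pearson product-moment correlation."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


# Criterion coded 0 = neurotic, 1 = psychotic (invented data).
criterion = [0, 0, 1, 1]

# Continuous predictor against a dichotomous criterion: a point biserial.
point_biserial = product_moment_r([1, 2, 3, 4], criterion)

# Dichotomous predictor against a dichotomous criterion: a phi coefficient.
phi = product_moment_r([0, 0, 1, 1], criterion)
```

A trichotomous index coded 1, 2, 3, as described for the Meehl-Dahlstrom Rules, would simply be passed as the first argument in the same way.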


As Table 2 indicates, while there were virtually no differences between
staff and trainee groups in the validity of their diagnoses, there were
great individual differences among diagnosticians. The best judge achieved
a validity coefficient for the total sample approximately equal to that of
the best sign, while the worst judge was outperformed by six of the single
scales! Since at any time, and for any sample, one would not be able to
identify the “best” judge, the average judge provides the most meaningful
comparison with the statistical indexes. The validity of the average staff
judge and average trainee judge over all seven samples was .28. While the
validity of the average diagnostician was higher than that of the best
single scale (Sc, with a validity coefficient of .19), it was considerably
below that achieved by even such a simple sign as Sc minus Neurotic Triad
(.36). The validity coefficients for the composite judgments were, in
general, higher than those for the average judge and lower than those for
the best judge.

Well over a quarter of the signs outperformed the average judge, including
a sign as simple as the High Point Rules. Moreover, three of the signs,
including Sc minus Neurotic Triad, were more valid than the composite
judge. While the sequential Meehl-Dahlstrom Rules achieved the highest
validity coefficient of any sign for the total sample, their incremental
validity (Sechrest, 1963) over more simple signs was extremely slight.

Table 3 lists the hit-rates for judges and diagnostic signs when every case
was diagnosed. For comparison purposes, the same signs as in Table 2 are
listed. When a diagnosis was made for every case, diagnosticians obtained a
62% hit-rate over all samples. The most valid sign over all samples was Sc
minus Neurotic Triad (with an accuracy percentage of 67), while the simple
High Point Rules predicted with 66% accuracy. In general, for any of the
better signs, approximately two-thirds of the patients were correctly
classified (roughly one-half would have been correctly identified by
chance).
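
A hit-rate of the kind reported above is simply the proportion of cases whose predicted class matches the criterion diagnosis once a cutting score is applied. The sketch below is illustrative only: the function name, the convention that scores at or above the cutting score are called psychotic, and the data are all assumptions.

```python
def hit_rate(scores, is_psychotic, cutting_score):
    """Proportion of cases classified correctly when every case is diagnosed.

    Scores at or above the cutting score are called psychotic (assumed
    convention); is_psychotic holds the criterion diagnoses as 0/1.
    """
    hits = sum(
        (score >= cutting_score) == bool(actual)
        for score, actual in zip(scores, is_psychotic)
    )
    return hits / len(scores)


# Invented data: four cases, three of them psychotic by criterion.
example_rate = hit_rate([3, 6, 7, 2], [0, 1, 1, 1], cutting_score=5)
```

In the invented example three of four cases are classified correctly, a hit-rate of .75. With a base rate near 50%, as in the total sample, about half of the cases would be correct by chance, which is the comparison drawn in the text.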
TABLE 2
COMPARISON OF VALIDITY COEFFICIENTS FOR DIAGNOSTICIANS VS. DIAGNOSTIC SIGNS

[Columns for Samples F and G and for the total sample are illegible in the
source. Illegible entries are shown as [?].]

                                            Sample
                               A            B            C            D            E

N                              92           77           103          42           181
% Psychotic                    64           52           47           52           50

Diagnosticians
  13 Staff: range              .34 to .50   .04 to .26   .05 to .36   −.08 to .33  .10 to .34
  Average staff                .43          .17          .27          .12          .22
  Composite staff              .51          .21          .32          .16          .28
  16 Trainees: range           .18 to .56   −.10 to .32  .04 to .45   −.12 to .43  .01 to .29
  Average trainee              .42          .15          .25          [?]          .18
  Composite trainee            .51          .20          .31          .22          .23
  All 29 judges: range         .18 to .56   −.10 to .32  .04 to .45   −.12 to .43  .01 to .34
  Average judge                .43          .16          .26          .15          .20
  Composite judge              .52          .20          .32          .19          .26
  % of 29 judges r > 0;
    p < .01                    97           7            59           [?]          62

Diagnostic signs
  Meehl-Dahlstrom Rules        .46          .22          .38          .21          [?]
  Two Point Rules              .39          .35          [?]          .34          .37
  Sc − Neurotic Triad          .42          .28          [?]          .45          [?]
  No. Taulbee-Sisson Signs     .31          .26          [?]          .33          .26
  Pt − Scᵃ                     .36          .31          .36          .50          .25
  High Point Rules             .34          .26          .45          .18          .36
  Hy − Paᵃ                     .31          .40          .36          .31          .13
  No. Peterson Signs           .52          .16          .28          .24          .15
  N − Pᵃ                       .41          .20          .37          .25          .06
  Pa − Neurotic Triad          .21          .37          .36          .24          .16
  N/Pᵃ                         .45          .18          .38          .19          .05
  Hs − Sc                      .45          .22          .33          .28          −.02
  Delta                        .32          .30          .37          .11          .15
  % of 43 signs r > 0;
    p < .01                    [?]          [?]          [?]          [?]          [?]

Single scales
  L                            .04          .19          −.02         .09          .13
  F                            .30          .15          .26          .12          −.14
  K                            .04          .04          .00          −.01         .06
  Hsᵃ                          [?]          .19          .26          .26          .03
  Dᵃ                           [?]          .19          .28          .22          [?]
  Hyᵃ                          .00          .23          .28          .40          .11
  Pd                           .30          −.05         .08          −.26         .19
  Pa                           .34          .19          [?]          [?]          .00
  Pt                           [?]          [?]          [?]          [?]          [?]
  Sc                           .41          −.01         .15          .06          −.01
  Ma                           [?]          −.02         .18          .29          .19

Significant r (p < .01)        .27          .30          .25          [?]          .29

Note. Numbers in boldface indicate highest coefficient for each sample.
ᵃ Keyed negatively with criterion.
TABLE 3
COMPARISON OF ACCURACY PERCENTAGES FOR DIAGNOSTICIANS VS. DIAGNOSTIC SIGNS
WHEN ALL CASES WERE DIAGNOSED

[Most entries for the diagnosticians are illegible in the source. Illegible
entries are shown as [?].]

                                              Sample
                             A     B     C     D     E     F     G     Total

N                            92    77    103   42    181   166   200   861
% Psychotic                  64    52    47    52    50    43    [?]   47

Diagnosticians
  13 Staff: range            [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]
  Average staff              [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]
  16 Trainees: range         [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]
  Average trainee            [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]
  All 29 judges: range       [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]
  Average judge              [?]   [?]   [?]   [?]   [?]   [?]   [?]   62

Diagnostic signs
  Meehl-Dahlstrom Rules      [?]   60    66    [?]   [?]   64    74    [?]
  Two Point Rules            71    [?]   68    [?]   65    61    74    [?]
  Sc − Neurotic Triad        67    [?]   68    [?]   [?]   66    [?]   67
  No. Taulbee-Sisson Signs   [?]   61    69    60    61    61    72    64
  Pt − Sc                    70    66    66    78    58    61    70    65
  High Point Rules           67    [?]   73    62    70    63    68    66
  Hy − Pa                    61    65    [?]   60    [?]   63    64    61
  No. Peterson Signs         [?]   58    53    62    55    48    [?]   60
  N − P                      65    54    63    57    57    [?]   72    63
  Pa − Neurotic Triad        [?]   66    64    60    56    65    66    62
  N/P                        65    53    [?]   57    56    63    72    63
  Hs − Sc                    74    61    61    62    51    55    66    61
  Delta                      [?]   62    67    45    60    63    70    [?]

Note. Numbers in boldface indicate highest percentage for each sample.


Table 4 presents the accuracy percentages for diagnosticians and diagnostic
signs when an “indeterminate” category was permitted. The percentage of
each sample called indeterminate was that so classified by the
Meehl-Dahlstrom Rules. For each [illegible] an indeterminate band were set
to [illegible] sample, so the hit-rates reported in Table 4 are again
cross-validated ones. The Meehl-Dahlstrom Rules have been given a slight
advantage in this comparison, since the percentage of indeterminate cases
was probably more optimum for this sign than for any of the others. While
the Meehl-Dahlstrom Rules did in fact achieve higher hit-rates over the
total sample, their incremental accuracy over far simpler signs (e.g., Sc
minus Neurotic Triad) was again quite slight (3%).⁶ In general, when the
MMPI was used to make this differential diagnosis, with roughly one-third
of the patients not diagnosed, the overall hit-rate for diagnosed patients
was still less than three-quarters correct.

⁶ The accuracy percentages for the Meehl-Dahlstrom Rules presented in Table
4 differ slightly from the corresponding figures [illegible] Meehl and
Dahlstrom (1960). For Samples [illegible] the two sets of figures are
identical. For [illegible] C, D, and F the two sets of figures [illegible]
1%, while for Sample B the figures [illegible]. These very slight
differences could [illegible] from discrepancies between the
computer-calculated classifications used in the present study [illegible]
diagnosticians [illegible], therefore, analyzed in the [illegible] study;
the [illegible] nonetheless is only [illegible]. Sample K from the earlier
study was not [illegible] by the diagnosticians, so this sample was excluded
from the present study. Consequently, the cross-validation sample used by
Meehl and Dahlstrom (N = 988) was larger than that used in the present
study (N = 861). However, the discrepancy between accuracy percentages for
the two total samples was only 2%.
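
The idea of scoring hit-rates only over diagnosed cases, with an indeterminate band left undiagnosed, can be sketched as below. The band limits, function name, and data are hypothetical; how the band was actually set for each index is not recoverable from the text above, so this only illustrates the bookkeeping.

```python
def hit_rate_with_indeterminate(scores, is_psychotic, low, high):
    """Hit-rate over diagnosed cases only, plus the proportion diagnosed.

    Scores below `low` are called neurotic, scores above `high` are called
    psychotic, and scores from `low` to `high` inclusive are left
    indeterminate (assumed convention).
    """
    diagnosed = [
        (score, actual)
        for score, actual in zip(scores, is_psychotic)
        if score < low or score > high
    ]
    if not diagnosed:
        return None, 0.0
    hits = sum((score > high) == bool(actual) for score, actual in diagnosed)
    return hits / len(diagnosed), len(diagnosed) / len(scores)


# Invented data: six cases, half of which fall in the indeterminate band.
rate, proportion_diagnosed = hit_rate_with_indeterminate(
    [1, 4, 5, 6, 9, 8], [0, 0, 1, 0, 1, 0], low=4, high=6
)
```

Reporting the proportion diagnosed alongside the hit-rate keeps the trade-off visible: a wider band can raise accuracy among diagnosed cases while leaving more patients undiagnosed.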
Sample A

This sample had the largest psychotic 
base rate (64%) of any in the study. The 
single best predictor for this sample was 
one trainee, this being one of two samples 
in which some actuarial sign did not outper- 
form all judges. The most accurate diag- 
nostician had a validity coefficient of .56, 
achieving a 77% hit-rate when all cases 
were diagnosed and an 85% hit-rate when 
an indeterminate category was permitted. 
The best actuarial index was the Peterson 
Signs, which had been derived on this sam- 
ple. It is important to realize that the 
Peterson Signs were not being cross-vali- 
dated on this sample and that this was the 
only sample for which this index bested all 
other signs. 

However, four of the signs listed in Table 2, in addition to the Peterson
Signs, outperformed the average judge (r = .42): the Meehl-Dahlstrom Rules
(.46), Neurotic Triad/Psychotic Triad (.45), Hs minus Sc (.45), and Sc minus
Neurotic Triad (.42). Of the 65 signs analyzed in the study, 25 had
validity coefficients ≥ .30 for this sample. Interestingly, four of the
clinical scales functioned fairly well as predictors by themselves: Sc, for
example, had a validity coefficient of .41, while F, Pt, and Pa all had
validity coefficients above .30.

The criterion for Sample A, subsequent hospitalization as a psychotic
patient in Minnesota, differs considerably from the criterion of concurrent
psychiatric diagnosis used for all other samples; therefore, no direct
cross-sample comparison of these findings is available. However, findings
from this sample appear similar in many respects to those from Sample G,
these being the only samples in which any diagnostician outperformed the
best sign, and these being the only samples in which the Peterson Signs and
Hs minus Sc predicted well. Since Sample G is also from a Minnesota
Veterans Administration installation, the generality of the findings from
these two samples to patient populations outside Minnesota Veterans
Administration settings is suspect.
Sample B

Sample B has been described as follows:


This is a sample from the cases previously reported by Hunt et al. (1948),
veteran in-patients tested at the Palo Alto V.A. Hospital. These diagnoses
were also uncontaminated, the MMPI profiles having been locked in the file
until the final diagnostic decision regarding the patient had been made. As
(rare!) evidence of criterion reliability, the authors state that one of
them, on the basis of listening to a reading of unidentified diagnostic
summaries and histories, classified each case as psychotic or non-psychotic
and agreed with the official hospital dichotomy in 94% of the cases [Meehl
& Dahlstrom, 1960].


It is significant to note that for this sample, where the diagnoses were
completely uncontaminated by MMPI information, the diagnosticians did very
poorly. Validity coefficients for the 29 clinicians ranged from −.10 to
.32, with a mean value of .16 (a correlation not significantly different
from zero for a sample of this size). All of the signs listed in Table 2
outperformed the average clinician; the best sign for this sample was Hy
minus Pa (r = .40). The Meehl-Dahlstrom Rules did worse in this sample than
in any other, the correlation (.22) not differing significantly from zero.
Only 9 of the 65 signs had validity coefficients ≥ .30, the lowest number
of such signs for any of the samples under study. The single MMPI scales
did not fare well, although the three scales of the Neurotic Triad each
outperformed the average clinician. Diagnostic accuracy for this sample,
using the best sign and allowing an indeterminate category, was only 71%,
dropping to 66% when all cases had to be diagnosed.


Sample C 
Sample C has been described as follows: 


These were patients seen on the in-patient psychiatric service of the
University of Minnesota Hospitals between 1951 and 1955 inclusive. All
cases in the files meeting the inclusion standard and for whom the MMPI had
been given within five days of admission and prior to any shock treatment
were included in the sample. It is safe to assume that the MMPI profile was
at least available to the diagnosing psychiatrist or staff conference
personnel on all of these cases, although the extent to which it
contributed to diagnosis would vary depending upon the orientation of the
staff member, the clarity of the diagnosis on clinical grounds, the
professional competence and aggressiveness of the particular psychologist
to whom the patient had been assigned, and so forth. The authors can state
on the basis of personal experience on this psychiatric service that the
degree of contamination of individual diagnoses would vary from practically
none at all at one extreme to a few cases in which the psychologist's
reading of the MMPI was practically determinative of the diagnosis
ultimately given [Meehl & Dahlstrom, 1960].


With the exception of the two Minnesota 
Veterans Administration samples (A and 
G), diagnosticians did better on this sam- 
ple than any other. Nevertheless, the best 
predictor was the simple High Point Rules, 
with one trainee approximating the accu- 
racy of this sign (r = .45). This was the 
only sample for which the High Point Rules 
performed as the best predictor. Of the 65 
signs utilized in this study, 24 achieved 
validity coefficients ≥ .30. All of the signs
listed in Table 2 as well as four of the 
single scales (Hy, D, Hs, and F) outper- 
formed the average judge (.26). Roughly 
three-quarters of this sample were correctly 
diagnosed by the High Point Rules; pro- 
viding an indeterminate category did not 
improve diagnostic accuracy for this index. 

While this sample apparently came from 
the same hospital population used as the 
derivation sample for the Meehl-Dahl- 
strom Rules (and the indexes unique to 
this study), the Meehl-Dahlstrom Rules 
did not hold up so well on this direct cross- 
validation as did the much simpler High 
Point Rules. The failure of the Meehl- 
Dahlstrom Rules to outperform other in- 
dexes on a sample virtually identical to 
that from which they were constructed is 
certainly one of the most enigmatic find-
ings of this study.


Sample D
Sample D has been described as follows: 


This sample consists of the admissible subset 
of cases from the data reported by Rubin (1948), 
being chiefly World War II veterans seen as in-
patients at the V.A. Mental Hospital in Chillicothe,
Ohio during 1946-47. The marked shrinkage in
sample size between Rubin's published N (61) and
the present is due to a combination of factors, the
most important two of which are the excessively
high incidence of invalid records (chiefly very
high "?" scores) and the sizable number of double
or triple diagnoses recorded by Rubin's psychiatric 
colleagues, making it impossible to make a mean- 
ingful decision upon the data available to us as 
to where a patient belonged on the criterion side. 
The exclusion of doubtful cases was of course based 
entirely upon the examination of a list of diag- 
nostic data prior to consulting the MMPI profile 
[Meehl & Dahlstrom, 1960].


This sample was by far the smallest 
analyzed in this study, and therefore its 
findings should be generalized only with 
extreme caution. Diagnosticians had the
most difficulty with this sample, achieving
an average validity coefficient of only .15 
(far below the correlation of .35 necessary 
for a statistically significant difference 
from zero). The best predictor was Pt minus 
Sc, achieving a validity coefficient of .50; 
however, this was the only sample for 
which this index was the best predictor. 
Virtually all of the signs listed in Table 2 
outperformed the average judge (although 
in this small sample, many of the validity 
coefficients were not significantly different 
from zero). 

Twelve of the 65 signs had validity 
coefficients ≥ .30. Among single scales, Hy
was the best predictor (.40), achieving a 
validity coefficient greater than most of 
the more complex signs. This was the only 
sample on which Ma had a sizable validity 
coefficient (.29). The average clinician per- 
formed at approximately the chance level 
for this sample, diagnosing 56% of the cases 
accurately when all cases were diagnosed, 
and barely raising his hit-rate when in- 
determinacy was permitted. Seventy-eight 
percent of these patients, however, could 
be correctly identified by Pt minus Sc, 
with no great increase in accuracy when 
an indeterminate category was permitted. 


Sample E 
Sample E has been described as follows: 


These profiles were made available through the 
kindness of Timothy Leary and are from the 
population employed in the Kaiser Foundation 
Research Project. The psychoneurotic group in-
cluded overtly neurotic patients who went subse-
quently into psychotherapy, characterized by overt 
anxiety and readiness to accept treatment, together 
with a second group of patients referred from a 
medical clinic for psychosomatic reasons, most of 
whom did not subsequently enter treatment. The 
psychotic cases are patients, chiefly schizophrenics 
from the Stockton State Hospital, about one- 
quarter of whom were originally seen in the 
Kaiser Foundation Hospital out-patient clinic. 
The criterion diagnoses of these patients are 
(personal communication from Dr. Leary) uncon- 
taminated [Meehl & Dahlstrom, 1960].


This sample, the second largest in the 
study, had virtually a perfect split between 
psychotics and neurotics. Judges, however,
did not do well in diagnosing patients in 
this sample, achieving an average validity 
coefficient of only .20. The best predictor 
for this sample was the Two Point Rules 
(.37), while the High Point Rules achieved
a validity coefficient of .36. The Meehl-
Dahlstrom Rules achieved a validity coeffi- 
cient of .34. 

Only 9 of the 65 signs achieved validity 
coefficients ≥ .30. Among single scales, D
was the most valid (.23). When all cases 
in this sample were diagnosed, the simple 
High Point Rules achieved the highest hit- 
rate (70%), better even than the Meehl- 
Dahlstrom Rules (63%). When a third of 
the cases were permitted to be indetermi- 
nate, the Meehl-Dahlstrom Rules (71%) 
barely outperformed the High Point Rules 
(69%). 
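
The two hit-rates quoted for each sample (all cases diagnosed vs. an indeterminate category permitted) can be illustrated with a short Python sketch. It is offered only as an illustration: the function name `hit_rates`, the index scores, the cutting score, and the indeterminate band are hypothetical and are not data from the study.

```python
def hit_rates(scores, is_psychotic, cut, band=None):
    """Return (hit-rate among diagnosed cases, fraction left indeterminate).

    A case is called psychotic when its score is >= cut. If `band` is given
    as (low, high), cases scoring inside it are left undiagnosed.
    """
    hits = diagnosed = 0
    for score, truth in zip(scores, is_psychotic):
        if band is not None and band[0] <= score <= band[1]:
            continue  # withheld: falls in the indeterminate band
        diagnosed += 1
        hits += (score >= cut) == truth
    return hits / diagnosed, 1 - diagnosed / len(scores)

# Hypothetical index scores and criterion diagnoses (True = psychotic).
scores = [30, 38, 42, 44, 47, 48, 52, 60, 35, 55]
truth = [False, False, True, False, False, True, True, True, False, True]

all_cases = hit_rates(scores, truth, cut=45)
with_band = hit_rates(scores, truth, cut=45, band=(40, 49))
print(all_cases, with_band)
```

In this toy example the cases nearest the cutting score are the ones most often misclassified, so withholding a diagnosis on them raises the hit-rate among the remaining cases at the cost of leaving some patients undiagnosed.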


Sample F 


Sample F, from the Langley-Porter Hos- 
pital in San Francisco, has been described 
as follows: 


These were in-patients diagnosed psychoneu- 
rosis (excluding character neuroses and mixed 
neuroses with schizoid features) or psychotics di- 
agnosed schizophrenia. Dr. Lingoes in providing 
us with these data warned that some of the psy- 
chotic patients were in remission or receiving EST 
at the time of MMPI administration, so that their 
curves will not reflect the patient's psychopathol- 
ogy at admission. The diagnoses were discharge 
diagnoses reached by joint cooperation of the 
psychiatric resident and the chief of service. No 
specific measures to prevent contamination were 
employed, so that some unknown degree of in- 
fluence of the test results upon the discharge diag-
nosis must be allowed for [Meehl & Dahlstrom,
1960].


Judges performed poorly on this sample,
the third largest in the study, achieving an 
average validity coefficient around .20. All 
of the signs listed in Table 2 outperformed 
the average judge, the best sign being the 
difference between Neurotic Triad and 
Psychotic Triad elevations. This was the
only sample, however, for which this index 
was the most effective. 

Ten of the 65 signs had validity coeffi-
cients ≥ .30. Single-scale validities were
not high, with Neurotic Triad scales
achieving the highest validity coefficients.
The Meehl-Dahlstrom Rules did not diag- 
nose very accurately for this sample (.32), 
seven simpler indexes performing more 
validly. While the average judge achieved 
a hit-rate of approximately 60% when all 
cases were diagnosed, Sc minus Neurotic 
Triad managed a hit-rate of 66%. The 
Neurotic Triad—Psychotic Triad difference 
was most accurate when an indeterminate 
category was permitted, diagnosing roughly 
three-quarters of the remaining sample 
correctly. 


Sample G 
Sample G has been described as follows: 


This sample comprises the cases remaining 
after exclusion of invalid profiles from a sample 
constructed by Dr. Albert Rosen from the files 
of the psychiatric in-patient service at the V.A.
Hospital, Minneapolis, Minnesota. Dr. Rosen con-
cocted this sample in connection with his research
on the development of “pure” MMPI scales ... so
that very strict diagnostic criteria were employed
in his inclusion of a case. Any patient for whom
a secondary diagnosis in addition to the dis-
charge diagnosis was indicated in the chart, or who
had been otherwise diagnosed in some previous
hospital or clinic or while in the service, was ex-
cluded as “doubtful” or “mixed.” This kind of
sample provides a better test than most from the
standpoint of concurrent validity; unfortunately,
the recording of a secondary or alternative diag-
nosis in this hospital often reflects the impact of a
disharmonious MMPI finding. Hence, the Rosen
sample is good because of its criterion reliability,
but bad because of likelihood of rather high con-
tamination. ... The 74 psychotic patients in this
sample were all diagnosed paranoid schizophrenia
... [Meehl & Dahlstrom, 1960].

The sample analyzed in this study in-
cluded all 74 psychotic patients plus 128
of the 199 neurotics from Rosen’s original
sample (Rosen, 1952; Rosen, 1958).
This sample, the largest under study, had
the smallest psychotic base rate. The best 
predictor was one trainee (.55). While the 
Meehl-Dahlstrom Rules achieved a validity 
coefficient of .52, the simple index, Sc minus 
Neurotic Triad, was equally valid. The 
diagnosticians performed better on this 
sample than any other, yet 7 of the signs 
listed in Table 2 were more valid than the 
average judge (.43). Twenty-five of the 65 
signs achieved validity coefficients ≥ .30.
Even the single scales did surprisingly well
for this sample, Sc and Pd achieving
validity coefficients of .42. In this sample,
single-scale validities were of the same
order of magnitude as that of the average 
diagnostician! 

When all patients were diagnosed, judges 
achieved a 70% hit-rate, which increased to 
76% when an indeterminate category was 
permitted. The Meehl-Dahlstrom Rules 
achieved an 82% hit-rate for diagnosed 
cases when indeterminate cases were per- 
mitted, while the Taulbee-Sisson Signs 
performed almost as well (79%). In gen- 
eral, roughly three-quarters of this purified 
and contaminated sample were correctly 
diagnosed by a number of indexes, includ-
ing the average diagnostician.


Intersample Comparisons 


Tables 2-4 show great differences in the
accuracy of the signs for different sam-
ples. An analysis of variance for 11 of the
most valid diagnostic signs across the
seven samples indicated that differences
between signs were statistically significant
at the .05 level, while differences between
samples were significant well beyond the
.01 level. A similar analysis of variance
for the diagnoses of the 13 staff judges
across the seven samples indicated non-
significant differences between judges, but
highly significant (p < .01) differences
between samples. For the 16 trainee judges,
differences among trainees as well as differ-
ences between samples were significant
beyond the .01 level. When the validity
coefficients of the 11 single MMPI scales
were utilized to compute indexes of simi-
larity between samples, Samples A and G
(the Minnesota Veterans Administration
samples) appeared most similar. Samples
B through F, while showing some similari- 
ties, showed much more variance specific 
to the installation. It is important to note 
that only on Samples A and G did the 
average diagnostician achieve even a 
modest degree of diagnostic accuracy. 

To examine the consistency of diag- 
nosticians’ accuracy across the seven sam- 
ples, all 29 judges were ranked on accuracy 
within each sample. Inspection of these 
rankings revealed that the best judge over 
all samples, a trainee, was most accurate 
(Rank 1) on only one sample (Sample A), 
and he ranked as low as 9 (Sample E). 
The second-best judge over all samples 
was most accurate on only one sample 
(Sample G), and he ranked as low as 14 
(Sample B). The third-best judge over all 
samples was most accurate on two samples 
(Samples C and D) but ranked 28 (second 
from the bottom) on Sample A. The worst 
judge over all samples achieved the lowest 
rank on four samples; on his best sample he 
ranked 17. The second-worst judge over all 
samples was low judge on only one sample, 
ranking 18 on his best sample. To sum- 
marize, no judge ranked first on more than 
two samples, although the worst judge 
ranked lowest on four samples. Perhaps of 
even more significance, the four best and 
two worst judges were all trainees. The 
fact that one judge ranked first on two 
samples and ranked 28 on another sample 
indicates the difficulty that would be en- 
countered in identifying the most accurate 
judge for any particular setting. 
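
The within-sample ranking used in this comparison can be sketched as follows. This is a minimal illustration rather than the procedure actually programmed for the study: the function name `rank_within_samples`, the three judges, and their validity coefficients are invented, and ties are broken arbitrarily by sort order.

```python
def rank_within_samples(accuracy):
    """accuracy: {sample: {judge: validity}} -> {judge: {sample: rank}}.

    Rank 1 is the most accurate judge within a sample.
    """
    ranks = {}
    for sample, by_judge in accuracy.items():
        ordered = sorted(by_judge, key=by_judge.get, reverse=True)
        for position, judge in enumerate(ordered, start=1):
            ranks.setdefault(judge, {})[sample] = position
    return ranks

# Hypothetical validity coefficients for three judges on three samples.
accuracy = {
    "A": {"j1": 0.40, "j2": 0.31, "j3": 0.12},
    "B": {"j1": 0.18, "j2": 0.25, "j3": 0.05},
    "C": {"j1": 0.22, "j2": 0.10, "j3": 0.30},
}
ranks = rank_within_samples(accuracy)
# Best and worst rank each judge obtains across samples.
spread = {judge: (min(r.values()), max(r.values())) for judge, r in ranks.items()}
print(spread)
```

A wide gap between a judge's best and worst rank is the pattern reported above: a judge who is first on one sample may stand near the bottom on another.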


ADDITIONAL ANALYSES


The data presented so far have indi- 
cated that, in general, simple diagnostic 
indexes (calculable by a clerk) were more 
valid than the clinical judgments of experi- 
enced clinicians for this diagnostic task. 
For example, the simplest of all the in- 
dexes—the High Point Rules—outper- 
formed the average diagnostician in five 
out of seven samples (and in the total 
sample of 861 cases). The average judge 
achieved a higher validity coefficient than 
the High Point Rules only in the Minnesota 
Veterans Administration samples (A and
G). In general, the simpler signs were ap-
proximately as valid over the total sample
as the complex indexes derived by Meehl
and Dahlstrom and by Taulbee and Sis-
son. The implications of this finding for
rapid screening in psychiatric settings
should be obvious.


TABLE 5
DIAGNOSTIC ACCURACY OF EACH HIGH
POINT CODE FOR THE TOTAL
SAMPLE (N = 861)

Scale     % of sample   Diagnosis   Accuracy %   Contribution
Hs (1)         14           N           74           .104
D (2)          33           N           64           .211
Hy (3)          7           N           63           .044
Pd (4)         14           P           66           .092
Pa (6)          2           P           94           .019
Pt (7)         11           N           57           .063
Sc (8)         14           P           73           .102
Ma (9)          5           P           62           .031

Overall       100                       66           .66

Table 5 summarizes an analysis of the
High Point Rules themselves. Included in
Table 5 is the percentage of the total cross-
validation sample (N = 861) for each of
the eight possible high codes, as well as
the diagnostic accuracy associated with
each high point. While only 2% of the total
population of psychiatric patients had
MMPI profiles with peaks on Pa, 94% of
this subgroup were diagnosed psychotic.
One-third of the total population had high
codes on D, and 64% of this subgroup
were diagnosed neurotic. Almost 11% of the
population had high points on Pt, but these
patients were fairly evenly split between
the two diagnostic categories. The last
column in Table 5 lists the “contribution”
of each high point, where contribution is
defined as the proportion of total accuracy
contributed by each component (i.e., the
accuracy percentage times the percentage
of the sample diagnosed). Because of the
small number of cases with peaks on Hy, 
Pa, and Ma, these three components con- 
tribute relatively little to the overall diag- 
nostic accuracy achieved by the High Point 
Rules. 
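
The "contribution" arithmetic just defined can be checked with a few lines of Python. The sketch below assumes the Table 5 entries as transcribed here; the Sc accuracy figure (73) is an inference from the printed contribution of .102 rather than a value read directly, and the variable names are illustrative.

```python
# (percent of sample, accuracy percent) for each high-point code in Table 5.
# Assumption: the Sc accuracy of 73 is inferred from its contribution (.102).
table5 = {
    "Hs": (14, 74), "D": (33, 64), "Hy": (7, 63), "Pd": (14, 66),
    "Pa": (2, 94), "Pt": (11, 57), "Sc": (14, 73), "Ma": (5, 62),
}

# Contribution = proportion of the sample with that high point
#                times the proportion of those cases diagnosed correctly.
contribution = {
    code: (pct / 100) * (acc / 100) for code, (pct, acc) in table5.items()
}
overall = sum(contribution.values())

for code, value in contribution.items():
    print(f"{code:>2}  {value:.3f}")
print(f"overall {overall:.3f}")
```

Because the tabled percentages are rounded, the summed contributions agree with the printed overall accuracy only to within about one percentage point.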

Table 6 presents a similar analysis for
each of the 16 Meehl-Dahlstrom Rules.
The data summarized in Table 6 come
from the total cross-validation sample
(N = 861). The second column of Table 6
lists the percentage of the sample which
was classified by each rule; these percent-
ages varied from .3% (Rule 1) to 16%
(Rule 3). However, in examining these
figures it is important to bear in mind that
the rules are sequentially applied and,
therefore, cases are progressively elimi-
nated as diagnostic decisions are made.
The third, fourth, and fifth columns of
Table 6 list, respectively, the percentage
of correct diagnoses, the percentage of pa-
tients classified "indeterminate," and the
percentage of incorrect diagnoses. Over all
16 rules, 51% of the patients were accu-
rately diagnosed, 18% were misdiagnosed,
and 31% were undiagnosed.


TABLE 6
DIAGNOSTIC ACCURACY OF EACH MEEHL-DAHLSTROM RULE FOR THE TOTAL SAMPLE (N = 861)

                                              Accuracy percentages
                                          Criterion     Meehl-Dahlstrom
Rule   % of sample   Hit %   I %   Miss %    N     P      N     P    N+P   Contribution
  1         .3         67     33      0     ...   ...    ...   ...   100       .002
  2        1.0         82      0     18       0   ...           82    82       .008
  3       16.0         44     36     20      47    42    ...   ...    69       .070
  4        8.6         49     38     13      20    68    ...   ...    78       .042
  5        3.5         90      0     10       0   100           90    90       .032
  6        2.6         68      0     32       0   100           68    68       .018
  7        7.9         25     54     21      10    36     60    54    55       .020
  8         .5        100      0      0       0   100          100   100       .005
  9         .6         80      0     20       0    80           80    80       .005
 10        4.5         82      0     18       0    82           82    82       .037
 11         .5         75      0     25       0    75           75    75       .004
 12        9.6         78      8     13      93     0     86          86       .075
 13       13.5         69      9     22      92     0     76          76       .093
 14       13.6         26     62     12      36     7    ...   ...    68       .035
 15       11.3         44     27     29      54    30    ...   ...    61       .050
 16        5.7         14     76     10     ...   ...    ...   ...    58       .008
Total    100           51     31     18      54    47     75    72    74       .51

The next two columns in Table 6 list
the percentage of criterion neurotics and
criterion psychotics who were correctly
identified by each rule. Overall, 54% of
the neurotics and 47% of the psychotics
were correctly identified. The next three
columns of Table 6 list the accuracy per-
centages for patients diagnosed neurotic,
psychotic, and all patients diagnosed (neu-
rotic plus psychotic), respectively. No
entry in a column indicates that the par-
ticular Meehl-Dahlstrom Rule did not so
classify any patient. Overall, of those pa-
tients called neurotic, 75% were criterion
neurotics; of patients called psychotic,
72% were correctly diagnosed. For all pa-
tients for whom a determinate diagnosis
was given, 74% were diagnosed correctly.
The last column of Table 6 indicates the
diagnostic contribution of each of the 16
rules, computed in the same manner as in
Table 5 (accuracy percentage times per-
centage of sample). While some of the
rules (e.g., 1, 8, 9, 11, and 16) seem to
contribute very little to the effectiveness
of the overall diagnostic procedure, the
sequential nature of the Meehl-Dahlstrom
system makes such a determination more
difficult than for the High Point Rules.
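
The relations among the Table 6 columns can be made concrete with a small sketch. It assumes that the accuracy for diagnosed cases is the hit percentage divided by the hit plus miss percentages, and that contribution is the proportion of the sample reaching a rule times its hit proportion, as the text describes; the function name `rule_summary` is illustrative. The inputs are the Rule 3, Rule 12, and overall figures quoted above.

```python
def rule_summary(pct_of_sample, hit, miss):
    """Return (accuracy % among diagnosed cases, contribution) for one rule.

    hit and miss are percentages of the cases reaching the rule; cases that
    are neither hits nor misses were left indeterminate.
    """
    diagnosed = hit + miss
    accuracy = round(100 * hit / diagnosed) if diagnosed else None
    contribution = round((pct_of_sample / 100) * (hit / 100), 3)
    return accuracy, contribution

rule_3 = rule_summary(16.0, hit=44, miss=20)
rule_12 = rule_summary(9.6, hit=78, miss=13)
overall = rule_summary(100, hit=51, miss=18)
print(rule_3, rule_12, overall)
```

The overall line reproduces the 74% accuracy for determinate diagnoses: 51% hits out of the 69% of patients who received a diagnosis at all.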


MMPI Components 


In an effort to ascertain whether other
predictive indexes not included in the origi-
nal 65 rational signs might predict more
accurately than any of the initial signs,
additional analyses were carried out. A
summary of one of these analyses, based
on the assumption that the information
contained in the 11 MMPI scales might
be reduced to a more reliable subset of five
basic components, is presented in Table 7.
The first component (L + K) combined the
two MMPI scales constructed to assess
test-taking defensiveness. The second com-
ponent (Hs + Hy) combined the two
highly correlated Neurotic Triad scales
which differentiate patients with psycho-
somatic symptomatology. The third com-
ponent (D + Pt) included the two MMPI
“mood” scales. The fourth component
(Pd + Ma) was formed from the two
“characterological” MMPI scales. The
fifth component (Pa + Sc) combined
the remaining two Psychotic Triad scales.
The validity coefficients for each of these
five components are listed in Table 7, both
for the derivation sample (N = 402) and
for the total cross-validation sample (N =
861). In general, the components were not
highly predictive for this diagnostic task,
their average validity being quite similar
to that of the single scales.


TABLE 7
SUMMARY OF VALIDITY COEFFICIENTS
FOR MMPI COMPONENTS

                                     Derivation    X-validation
Component                              sample         sample
                                     (N = 402)      (N = 861)

L + K                                   −01            10**
Hs + Hy                                 −24**         −17**
D + Pt                                  −09           −09**
Pd + Ma                                  23**          10**
Pa + Sc                                  27**          19**
L − K                                    01            04
Hs − Hy                                 −13**         −02
D − Pt                                  −23**          14
Pd − Ma                                  03            01
Pa − Sc                                  01           −09**
| L − K |                               −07            02
| Hs − Hy |                             −11*          −04
| D − Pt |                              −05           −04
| Pd − Ma |                              13**          02
| Pa − Sc |                              00            13**
(L + K) − (Hs + Hy)                     ...            21**
(L + K) − (D + Pt)                       05            14**
(L + K) − (Pd + Ma)                     −15**         −04
(L + K) − (Pa + Sc)                     −19**         −06
(Hs + Hy) − (D + Pt)                    ...           −07*
(Hs + Hy) − (Pd + Ma)                   −34**         −27**
(Hs + Hy) − (Pa + Sc)                   −43**         −34**
(D + Pt) − (Pd + Ma)                    −23**         −20**
(D + Pt) − (Pa + Sc)                    −35**         ...
(Pd + Ma) − (Pa + Sc)                   −10*          −04
| (L + K) − (Hs + Hy) |                 −18**         −10**
| (L + K) − (D + Pt) |                  −05           −07*
| (L + K) − (Pd + Ma) |                  02            07*
| (L + K) − (Pa + Sc) |                  11*          ...
| (Hs + Hy) − (D + Pt) |                −08           −03
| (Hs + Hy) − (Pd + Ma) |               −17**         −14**
| (Hs + Hy) − (Pa + Sc) |               −07           ...
| (D + Pt) − (Pd + Ma) |                −06           ...
| (D + Pt) − (Pa + Sc) |                −03           ...
| (Pd + Ma) − (Pa + Sc) |                15**         ...

* p < .05.
** p < .01.

The validity coefficients for (a) the 
arithmetic difference between elements of 
each component and (b) the absolute dif- 
ferences between component elements are 
also listed in Table 7. For both types of 
differences, the validity coefficients were 
quite low. Table 7 also lists the validity 
coefficients for the arithmetic difference 
between each pair of MMPI components, 
followed by validity coefficients for the 
absolute values of these differences. Note 
that the validities of the absolute differ- 
ences between pairs of components were 
small, while the validity of one of the
arithmetic differences, (Hs + Hy) −
(Pa + Sc), was equal to that of the far
more complicated Taulbee-Sisson Signs
(Table 2). As Table 2 also indicates, how-
ever, the validity of the even simpler sign,
Pt − Sc, was approximately as high.
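
The kinds of index summarized in Table 7 (component sums, arithmetic differences, and absolute differences) and their validity coefficients can be sketched with NumPy. This is an illustration under stated assumptions: the validity coefficient is taken to be the Pearson correlation between an index and a dichotomous criterion (psychotic = 1, neurotic = 0), and the six profiles are invented, not drawn from the study's samples.

```python
import numpy as np

def validity(index, criterion):
    """Pearson r between an index and a 0/1 criterion (point-biserial)."""
    return float(np.corrcoef(index, criterion)[0, 1])

# Hypothetical T scores for six profiles; criterion 1 = psychotic, 0 = neurotic.
hs = np.array([70, 65, 72, 55, 50, 58])
hy = np.array([68, 70, 66, 52, 55, 50])
pa = np.array([50, 55, 52, 70, 66, 72])
sc = np.array([55, 52, 58, 75, 70, 68])
criterion = np.array([0, 0, 0, 1, 1, 1])

neurotic_component = hs + hy    # component formed by summing two scales
psychotic_component = pa + sc

results = {
    "Hs + Hy": validity(neurotic_component, criterion),
    "Hs - Hy": validity(hs - hy, criterion),
    "|Hs - Hy|": validity(np.abs(hs - hy), criterion),
    "(Hs + Hy) - (Pa + Sc)": validity(
        neurotic_component - psychotic_component, criterion
    ),
}
for name, r in results.items():
    print(f"{name:<24} {r:+.2f}")
```

With the criterion coded this way, an index on which neurotic patients score higher receives a negative coefficient, which matches the sign pattern of the neurotic-minus-psychotic differences in Table 7.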


Scale Scores vs. Scale Ranks 


In light of the diagnostic accuracy 
achieved by the simple High Point Rules 
(which rely only on the highest ranking 
clinical scale regardless of its actual eleva- 
tion), a comparison of validity coefficients 
for scale scores vs. scale ranks was carried 
out. Each of the eight available clinical 
scales was ranked for each profile, the left-
hand scale on the standard profile sheet
being given the higher rank in cases of
exact ties. Each of the scale ranks was then
correlated with criterion diagnoses for the 
derivation sample and for the total cross- 
validation sample. Table 8 summarizes this 
analysis. In general, scale ranks achieved
higher validity coefficients than scale scores,
indicating that the elevation component of
the profile was not very important for this
diagnostic problem. The rank of Sc achieved
a validity coefficient (over all of the cross-
validation samples) of .31, making it a
more valid sign than all but 6 of the 65
signs utilized in the initial analyses (Table
2). The rank of Sc, in fact, turned out to
be a more effective predictor than the aver-
age diagnostician.


TABLE 8
A COMPARISON OF VALIDITY COEFFICIENTS FOR SCALE SCORES VS. SCALE RANKS

                                             Derivation sample     X-validation sample
                                                 (N = 402)              (N = 861)
                                             Scores    Ranks^a      Scores    Ranks^a

Hs                                            −23       −31          −14       −19
D                                             −14       −25          −13       −21
Hy                                            −20       −30          −18       −26
Pd                                             20        21           16        16
Pa                                             27        29           17        19
Pt                                             02^b     −01^b        −04^b      ...
Sc                                             20        27           19        31
Ma                                             17        16           15        12

(Hs + Hy) − (Pa + Sc)                          43        41           34        35
(Hs + Hy)/(Pa + Sc)                            42        38           34        33
(Hs·Hy)/(Pa·Sc)                                34        32           29        27
Hs + D + Hy + Pt                              −22       −42          −15       −38
Pd + Pa + Sc + Ma                              19        42           12        38
(Hs + D + Hy + Pt) − (Pd + Pa + Sc + Ma)      −42       −42          −31       −38
(Hs + D + Hy + Pt)/(Pd + Pa + Sc + Ma)        −44       −40          −30       −36
Sum of reciprocal ranks (Hs, D, Hy, Pt) −
  sum of reciprocal ranks (Pd, Pa, Sc, Ma)     ...       ...          ...       ...

^a Signs of correlation coefficients are reversed to allow direct comparison with coefficients for scale
scores.
^b Not significantly different from zero; all other correlations, p < .01.
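
The ranking step, and rank-based composites of the kind this section reports, can be sketched as follows. The sketch rests on assumptions that should be read as such: rank 1 is taken to be the highest scale, "the higher rank" in a tie is read as the rank nearer 1, the left-to-right order is the usual profile order of the eight clinical scales, and the profile values and function names are hypothetical.

```python
# Assumed left-to-right order of the eight clinical scales on the profile sheet.
SCALES = ["Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc", "Ma"]
NEUROTIC = ["Hs", "D", "Hy", "Pt"]
PSYCHOTIC = ["Pd", "Pa", "Sc", "Ma"]

def scale_ranks(profile):
    """Rank 1 = highest score; exact ties go to the scale further left."""
    ordered = sorted(SCALES, key=lambda s: (-profile[s], SCALES.index(s)))
    return {scale: rank for rank, scale in enumerate(ordered, start=1)}

def summed_rank_difference(ranks):
    """Summed ranks of the neurotic scales minus those of the psychotic scales."""
    return sum(ranks[s] for s in NEUROTIC) - sum(ranks[s] for s in PSYCHOTIC)

def reciprocal_rank_difference(ranks):
    """Sum of 1/rank over neurotic scales minus the same sum over psychotic scales."""
    return sum(1 / ranks[s] for s in NEUROTIC) - sum(1 / ranks[s] for s in PSYCHOTIC)

# Hypothetical profile of T scores with a tie between D and Pa.
profile = {"Hs": 60, "D": 72, "Hy": 65, "Pd": 58, "Pa": 72, "Pt": 70, "Sc": 80, "Ma": 50}
ranks = scale_ranks(profile)
print(ranks)
print(summed_rank_difference(ranks), round(reciprocal_rank_difference(ranks), 3))
```

Because the eight ranks always sum to 36, the summed ranks of the neurotic and psychotic scales are perfect mirror images of one another, which is why their validity coefficients in Table 8 are equal in size and opposite in sign. The reciprocal form weights the top-ranked scales most heavily, so it behaves like a graded version of the High Point Rules.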

Since the MMPI components analysis
indicated that the index, (Hs + Hy) −
(Pa + Sc), was a fairly effective pre-
dictor, the validity of this sign was com-
pared for scores vs. ranks. As Table 8
indicates, scale ranks did not increase the
validity of this sign, nor those of the closely
related indexes, (Hs + Hy)/(Pa + Sc)
and (Hs·Hy)/(Pa·Sc). Also listed in
Table 8 are the validity coefficients for the
summed ranks of the four neurotic scales
and the four psychotic scales. For these
indexes, ranks were superior to scale
scores, achieving validity coefficients (for
the total cross-validation sample) higher
than all initial 65 signs other than the
Meehl-Dahlstrom Rules. Interestingly, the
sum of the reciprocals of the ranks of the
neurotic scales minus the sum of the re-
ciprocals of the ranks of the psychotic
scales turned out to be an even more ef-
fective predictor, achieving a validity co-
efficient in the cross-validation samples
approximately equal to that of the Meehl-
Dahlstrom Rules. The validity of this in-
dex was as high as that of the best diag-
nostician.

For the (Hs + Hy) − (Pa + Sc) index, as well as for the other indexes
listed in the bottom half of Table 8, scale ranks
(1-8) were substituted for the original K-corrected
T scores. Validity coefficients were then computed
for each index, using scale ranks as elements, and
compared with corresponding validity coefficients
when scale scores were used.


Linear Regression Analyses


While Meehl (1959) concluded that a
linear combination of scale scores could
not predict psychoticism vs. neuroticism
under cross-validation conditions as well
as the Meehl-Dahlstrom Rules, findings
from the present study (for example, that
a linear combination of four scales had a
cross-validity approximately equal to the
Meehl-Dahlstrom Rules) led to a more
comprehensive analysis of the power of
conventional linear regression methods.
While it was not feasible to carry out an
analysis for each possible combination of
the 65 initial signs, 19 analyses utilizing
differing numbers of variables and sam-
pling differing kinds of predictors are sum-
marized in Table 9. In all of these
analyses, the original Meehl-Dahlstrom
derivation sample served as the linear re-
gression derivation sample, and results
were cross-validated on the total sample
of 861 patients.

The analyses summarized in Table 9 
are listed in approximate order of the per- 
centage of variance predicted in the deriva- 
tion sample. In general, as more variables 
were included in the analysis, the greater 
was the non-cross-validated multiple R. 
However, inspection of the cross-validation 
column of Table 9 reveals a curvilinear re-
lation between initial multiple R and cross-
validity. As more variables were included,
the cross-validated multiple Rs increased
(up to an asymptote at about 10 variables) 
and then began to gradually decrease. The 
linear regression analysis including the 11 
single scales yielded a higher cross-vali- 
dated multiple R than all analyses which 
utilized more variables. 
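
The derivation and cross-validation logic behind Table 9 can be sketched with NumPy. Everything in the sketch is an assumption made for illustration: the data are simulated (only the sample sizes, 402 and 861, echo the study), the criterion is generated so that just three of fifteen predictors carry signal, and ordinary least squares via `numpy.linalg.lstsq` stands in for whatever regression program was actually used.

```python
import numpy as np

def fit(X, y):
    """Least-squares weights (intercept first) from the derivation sample."""
    A = np.column_stack([np.ones(len(X)), X])
    weights, *_ = np.linalg.lstsq(A, y, rcond=None)
    return weights

def multiple_r(X, y, weights):
    """Correlation between predicted and actual criterion values."""
    predicted = np.column_stack([np.ones(len(X)), X]) @ weights
    return float(np.corrcoef(predicted, y)[0, 1])

rng = np.random.default_rng(0)
n_deriv, n_cross, n_pred = 402, 861, 15
X = rng.normal(size=(n_deriv + n_cross, n_pred))
# Hypothetical criterion: only the first three predictors carry any signal.
y = (X[:, :3].sum(axis=1) + rng.normal(scale=3, size=len(X)) > 0).astype(float)
Xd, yd, Xc, yc = X[:n_deriv], y[:n_deriv], X[n_deriv:], y[n_deriv:]

results = {}
for k in (3, 8, 15):
    w = fit(Xd[:, :k], yd)  # weights come from the derivation sample only
    results[k] = (multiple_r(Xd[:, :k], yd, w), multiple_r(Xc[:, :k], yc, w))
    print(k, [round(r, 2) for r in results[k]])
```

Adding predictors can never lower the derivation-sample multiple R, since the fitted weights are free to capitalize on chance in that sample. The cross-validated R carries no such guarantee, which is the asymmetry the curvilinear relation in Table 9 reflects.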

Of the 19 linear regression analyses, 13
produced cross-validated multiple Rs
which were equal to or greater than the
validity coefficient achieved by the Meehl-
Dahlstrom Rules (.39). This latter finding,
in direct contrast to that reported by
Meehl (1959), is of great significance
when one realizes the extent to which
Meehl's original conclusion has been sub-
sequently quoted (e.g., Sundberg, 1960).
The findings presented in Table 9 must be
interpreted somewhat cautiously, since no 
third sample was available for additional 
cross-validation. However, when one con- 
siders the size of the two sets of samples 
(N = 1263), the expected shrinkage upon 
further cross-validation should be quite 
low. 

To ascertain the effect of partitioning
the predictive indexes into 10 intervals (as
described earlier), three linear regression
analyses utilizing ungrouped data but
otherwise identical to previous analyses
(including 2, 4, and 11 variables, respec-
tively) were carried out. In general, as
Table 9 indicates, the multiple Rs for un-
grouped data were approximately .01
higher than the same values for grouped
data, indicating that the grouping proce-
dure had not markedly affected the ob-
tained results.


TABLE 9
SUMMARY OF 19 LINEAR REGRESSION ANALYSES: MULTIPLE Rs FOR DERIVATION AND
CROSS-VALIDATION SAMPLES

No. of                                                                                Cross-
variables  Variables                                                    Derivation    validation

 2         Pt, Sc                                                          .27           .32
 2^a       Pt, Sc                                                          .28           .33
 4         Hs, D, Hy, Sc                                                   .41           .36
 4^a       Hs, D, Hy, Sc                                                   .43           .37
 4         Hs, Pa, Pt, Sc                                                  .43           .38
 5         L, Hy, Pa, Pt, Sc                                               .44           .43
 5^a       Ranks of Hs, D, Hy, Pt, Ma                                      .46           .40
 6         Hs, Pa, Pt, Sc, High Point Rules, Low Point Rules               .47           .40
 7         Hs, D, Hy, Pd, Pa, Pt, Sc                                       .46           .40
 8         Hs, D, Hy, Pd, Pa, Pt, Sc, Ma                                   .46           .40
 8^a       Ranks of 8 clinical scales                                      .46           .40
 9         Hs, D, Hy, Pd, Pa, Pt, Sc, High Point Rules, Low Point
             Rules                                                         .48           ...
 9         Hs, Pa, Pt, Sc, High Point Rules, Low Point Rules, N/P,
             Taulbee-Sisson Signs, Peterson Signs                          .49           .41
11         All 11 single scales                                            .47           .42
11^a       All 11 single scales                                            .48           .43
12         Hs, D, Hy, Pd, Pa, Pt, Sc, High Point Rules, Low Point
             Rules, N/P, Taulbee-Sisson Signs, Peterson Signs              .50           .40
15         No. of clinical scales > 70, X, N/P, Hs − Sc, Hy − Pa,
             Pt − Sc, Taulbee-Sisson Signs, Peterson Signs, Inter-
             nalization Ratio, Delta, Pa − N, Sc − N, N − P, High
             Point Rules, Low Point Rules                                  .50           .40
15         N/K, ..., N/C, Hs − Pa, Hs − Sc, Hy − Pd, Hy − Pa,
             Pt − Sc, Taulbee-Sisson Signs, Peterson Signs, Beta,
             Delta, Pa − Ñ, Sc − Ñ, High Point Rules                       .50           .39
15         No. of clinical scales > 60, no. of clinical scales > 70, no. of
             clinical scales > 80, K, D, Pd, Pt, Ma, N/P, N/6, spik-
             ing of Neurotic Triad, Peterson Signs, Internalization
             Ratio, Rank D, High Point Rules                               .54           .37

^a Ungrouped data; all other analyses utilized grouped data (see text).

While the cross-validation program uti- 
lized for these analyses did not permit the 
inclusion of more than 15 variables for 
sample sizes this large, another linear re- 
gression program (with no cross-valida- 
tive potential) permitted the determination 
of the maximum possible non-cross-vali- 
dated multiple R from the derivation 
sample. An analysis which included the 11 
MMPI scales, the eight ranks, and most of 
the initial signs other than simple linear 
composites (41 variables) yielded a non- 
cross-validated multiple R of .56. The last 
row in Table 9 reports a cross-validation of 
a linear regression analysis utilizing those
15 variables from the larger analysis which
had the highest beta weights in the deriva- 
tion sample. Note that the non-cross- 
validated multiple R of .54 was approxi- 
mately equal to that achieved by the 
larger set of variables, thus giving a fairly 
accurate approximation of the results 
which would be obtained if all predictors 
were utilized. Under these latter condi- 
tions, however, one could expect the cross- 
validated multiple R to decrease appreci- 
ably. 

The sixth row in Table 9 summarizes a
linear regression analysis utilizing the five
MMPI scales with highest beta weights in
both the derivation and cross-validation
samples. Upon cross-validation, this com-
bination of scales produced a multiple R
as large as that achieved by any other
linear combination. Consequently, a new
index, the simple linear nonweighted
composite of these five variables, was com-
puted. The validity coefficient for the new
index, (L + Pa + Sc) − (Hy + Pt), was
.44 (for the total cross-validation sample),
making this index the most effective single
predictor of psychoticism vs. neuroticism
from the MMPI. For the total cross-
validation sample (N = 861), the mean
score of this index for neurotic patients
was 34.4 (with a standard deviation of
16.7); the mean score of psychotic patients
was 51.5 (with a standard deviation of
18.7). The validity coefficient of this simple
linear composite was higher than that of
the Meehl-Dahlstrom Rules as well as that
of any individual diagnostician. Using a
cutting score of 45 (scores of less than 45
called neurotic; scores of 45 or greater
called psychotic), this index achieved an
accuracy percentage of 70% (on the
total cross-validation samples), 3% higher
than the best indexes listed in Table 3.
When scores of 40 to 49 were called in-
determinate (20% of the total sample), the
accuracy percentage for diagnosed cases
rose to 74%. This hit-rate for determinate
cases was the same as that of the Meehl-
Dahlstrom Rules, which, however, classi-
fied 31% of the total population as in-
determinate. Consequently, it is apparent
that this simple linear composite is a more
effective predictor for this diagnostic task
than all diagnosticians and even the for-
midable Meehl-Dahlstrom Rules.
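
The composite index and its two decision rules are simple enough to state as code. The sketch below follows the formula and cutting scores given in the text; the function names and the two example profiles are hypothetical, and the inputs are assumed to be K-corrected T scores.

```python
def composite_index(L, Hy, Pa, Pt, Sc):
    """Unweighted composite (L + Pa + Sc) - (Hy + Pt)."""
    return (L + Pa + Sc) - (Hy + Pt)

def diagnose(score, cut=45, indeterminate=None):
    """Scores of `cut` or more are called psychotic, lower scores neurotic.

    If `indeterminate` is a (low, high) pair, scores in that inclusive range
    receive no diagnosis.
    """
    if indeterminate is not None and indeterminate[0] <= score <= indeterminate[1]:
        return "indeterminate"
    return "psychotic" if score >= cut else "neurotic"

# Two hypothetical profiles (assumed to be K-corrected T scores).
first = composite_index(L=50, Hy=65, Pa=60, Pt=70, Sc=72)
second = composite_index(L=50, Hy=70, Pa=55, Pt=68, Sc=62)

for score in (first, second):
    print(score, diagnose(score), diagnose(score, indeterminate=(40, 49)))
```

The first profile scores 47, so it is called psychotic when every case must be diagnosed but is withheld when scores of 40 to 49 are treated as indeterminate; the second scores 29 and is called neurotic under either rule.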


Linear vs. Configural Models 


While the limit of linear prediction for
this task appears to have been reached, it
is of course possible that configural analyses
(e.g., analyses allowing for the inclusion
of cross-product and exponential
terms) might permit increased predictive
validity. Consequently, a series of additional
analyses was carried out, pitting the
conventional linear model against two
configural methods developed by Paul J.
Hoffman at Oregon Research Institute. Because
these configural models utilized large
numbers of predictor elements and thus
imposed serious limitations on the storage
capacity of computers, it was necessary to
restrict the size of the derivation and 
cross-validation samples. Therefore, the 
original derivation sample was split into 
two equal halves, odd vs. even, and the 
results from each half were cross-validated 
(a) against the other half and (b) against 
each of the seven cross-validation samples. 
Table 10 presents the results of these
analyses. Three major models were compared.
The first of these models, the well-known
linear regression model, combined
linearly and optimally the 11 single predictor
scales. The quadratic model permitted
the inclusion of predictor pair
cross-products as well as squared predictor
scores (e.g., Hs, D, Hs², D², and Hs·D),
generating a total predictor pool of 77
variables (11 single-scale scores, 11
squared scores, and 55 cross-product
terms). Each of the 77 elements was
then combined linearly, utilizing a conventional
linear regression analysis; the
results of this analysis were then cross-validated
against each of the eight nonderivation
samples.
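
The count of 77 predictors follows directly from the construction: 11 scores, 11 squares, and one cross-product for each of the 11 x 10 / 2 = 55 pairs. The sketch below generates that predictor pool for a single profile. It is a hypothetical illustration (the function names are not from the original analysis) and covers only the expansion step, not the regression fitted to the expanded variables.

```python
from itertools import combinations
from typing import List, Sequence


def quadratic_expand(scores: Sequence[float]) -> List[float]:
    """Return the scores, their squares, and all pairwise cross-products."""
    linear = list(scores)
    squares = [s * s for s in linear]
    cross = [a * b for a, b in combinations(linear, 2)]
    return linear + squares + cross


def term_labels(names: Sequence[str]) -> List[str]:
    """Labels in the same order as the values from quadratic_expand."""
    linear = list(names)
    squares = [f"{n}^2" for n in linear]
    cross = [f"{a}*{b}" for a, b in combinations(linear, 2)]
    return linear + squares + cross


scales = ["L", "F", "K", "Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc", "Ma"]
print(len(term_labels(scales)))      # size of the predictor pool
print(term_labels(["Hs", "D"]))      # the two-scale example from the text
```

With so many predictors relative to a derivation half-sample of about 200 cases, a least-squares fit has ample room to capitalize on chance, which is why the cross-validated figures rather than the derivation figures are the ones to compare.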

The third model, a two-stage procedure,
was more complex. For each of the 55
possible predictor pairs, X and Y, the
following eight indexes were generated: X,
Y, X², Y², XY, X²Y, XY², and X²Y².
For each predictor pair, these eight terms


TABLE 10
A COMPARISON OF VALIDITY COEFFICIENTS
(MULTIPLE Rs) FOR THREE
REGRESSION MODELS


                                 Linear   Quadratic   Two-stage
                           N     model    model       model


Derivation sample         201     .51      .52         .60
X-valid samples
  Derivation
  (other half)            201     .42      .42         .42
  A                        92     .43      .39         .40
  B                        77     .32      .36         .41
  C                       103     .43      .43         .38
  D                        42     .30      .32         .20
  E                       181     .18      .18         .23
  F                       166     .37      .38         .84
  G                       200     .57      .58         .56
Average:
  8 X-valid              1062     [?]      .41         .40
were combined linearly to generate a new
variable, V, which optimally predicted 
the criterion. The resultant 55 Vs consti- 
tuted the variables for a second linear 
regression analysis, which was then cross- 
validated against each of the eight non- 
derivation samples. 

The data summarized in Table 10 allow
a comparison of the three prediction 
models, each replicated for odd vs. even 
halves of the original derivation sample; 
the entries in Table 10 are the average 
multiple Rs for the two replications. The 
first row of Table 10 lists the non-cross- 
validated multiple R for each model, 
while the other rows list the cross-validated 
multiple Rs. The configural models tended 
to include more of the variance from the 
derivation sample than did the linear 
model, but upon cross-validation the oppo- 
site trend appeared. In general, the linear 
model achieved cross-validities as high as 
either of the two configural models. That is,
a simple linear combination of MMPI
scores maintained cross-validities as high
as or higher than either of the configural
models.

An interesting additional finding from 
these analyses was the great variation 
among samples in cross-validated multiple 
Rs. Samples D and E (the Chillicothe 
Veterans Administration and Kaiser 
Foundation samples) were relatively poorly 
predicted by all models. On the other hand, 
for Sample G (Rosen’s Minneapolis Vet- 
erans Administration sample composed of 
“pure” cases where the diagnoses may 
have been badly contaminated) the cross- 
validated multiple Rs were appreciably 
higher than for the other half of the origi- 
nal derivation sample! In general, the rela- 
tive predictability of each of the cross- 
validation samples was approximately the 
same as that reported in Table 2. 


Actuarial Tables 


Lykken and Rose (1963) have argued 
that the most useful prediction model 
may be the venerable actuarial table, which 
they have relabeled the disjunctive tech- 
nique. Any pair of continuous variables 
partitioned into two or more intervals can 


generate a two-dimensional actuarial table 
for which the criterion frequencies within 
each cell can be discovered. Lykken and 
Rose report an application of the disjunctive
technique using pairs of MMPI
scales, each scale partitioned into three
intervals (i.e., nine-cell actuarial tables).
Using one such table with the same diag- 
nostic problem and the same data utilized 
in the present study, they achieved cross- 
validated accuracy percentages of 64% 
(when all patients were diagnosed) and 
70% (when 31% of the patients were classi- 
fied indeterminate). While the present 
study has demonstrated that the linear 
combination, L + Pa + Sc — Hy — Pt, 
is a more valid predictor for this diagnostic 
task than their best actuarial table, it is 
certainly reasonable that tables other than 
those Lykken and Rose investigated might 
support their theoretical argument. For 
this reason, a series of disjunctive studies 
was carried out. 
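
A two-dimensional actuarial table of the kind described above can be sketched as follows. The sketch is hypothetical throughout: the cut points (60 and 70), the four derivation cases, and the function names are invented for illustration and are not taken from Lykken and Rose or from the present data. Each scale is partitioned into three intervals, criterion frequencies are tallied within each of the nine cells, and a new case receives the majority diagnosis of its cell.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Cell = Tuple[int, int]


def interval(score: float, cuts: Sequence[float]) -> int:
    """Index of the interval a score falls in (0 .. len(cuts))."""
    return bisect_right(cuts, score)


def build_table(
    cases: Iterable[Tuple[float, float, bool]],
    cuts_x: Sequence[float],
    cuts_y: Sequence[float],
) -> Dict[Cell, List[int]]:
    """Tally [n_neurotic, n_psychotic] for every occupied cell."""
    counts: Dict[Cell, List[int]] = defaultdict(lambda: [0, 0])
    for x, y, psychotic in cases:
        cell = (interval(x, cuts_x), interval(y, cuts_y))
        counts[cell][1 if psychotic else 0] += 1
    return dict(counts)


def predict(table, x, y, cuts_x, cuts_y) -> str:
    """Majority diagnosis of the case's cell; empty or tied cells give no call."""
    cell = (interval(x, cuts_x), interval(y, cuts_y))
    if cell not in table:
        return "indeterminate"
    n_neurotic, n_psychotic = table[cell]
    if n_neurotic == n_psychotic:
        return "indeterminate"
    return "psychotic" if n_psychotic > n_neurotic else "neurotic"


cuts = (60, 70)  # hypothetical cut points: below 60, 60 to 69, 70 and above
derivation = [(55, 80, True), (56, 82, True), (58, 75, False), (75, 50, False)]
table = build_table(derivation, cuts, cuts)
print(predict(table, 50, 90, cuts, cuts))
print(predict(table, 65, 65, cuts, cuts))  # no derivation cases in this cell
```

The second call shows what happens when a new case lands in a cell that was empty in the derivation sample: the table has nothing to say about it.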

One actuarial table was constructed 
from the eight high-point and the eight 
low-point codes. The cross-validated ac- 
curacy percentage for this 64-cell table 
was not significantly higher than for the 
High Point Rules themselves, indicating 
that MMPI low points contributed rela- 
tively little to predictive accuracy for this 
diagnostic task. As a further investigation 
of the power of the disjunctive technique, 
two-scale high points and two-scale low 
points were combined into an actuarial 
table, again with no gain in cross-validity 
over the two high points alone. Another 
actuarial table was constructed by using 
the high points and the number of clinical 
scales with T scores ≥ 70, again with no
significant increase in diagnostic accuracy
over the simple High Point Rules. Since
Sc - N proved to be a valid predictor, as
had the High Point Rules, an actuarial
table using the eight high points against
Sc - N (partitioned into 10 intervals) was
also constructed, again with no significant
gain in cross-validity. Attempts were also
made to examine three-point, four-point,
and five-point rules, but the number of
empty cells in either the derivation or
cross-validation samples (but not both)
led to the rejection of this technique. Another
approach, dichotomizing each of the
eight clinical scales at T = 70, yielded an
actuarial table of 2⁸ or 256 cells, of which
profiles from the derivation sample fell 
into 142 cells. While the non-cross-vali- 
dated accuracy percentage for this proce- 
dure was 78%, cross-validation again proved 
unsuccessful. Finally, the ranks of selected 
sets of clinical scales (e.g., Hy, Pa, and Sc; 
Hs, D, Pd, and Ma; Hy, Pa, Pt, and Sc; Hs, 
Hy, Pa, and Sc) were used to construct
multidimensional actuarial tables, again
with no significant increase in cross-validity
over far simpler procedures. 

In Lykken and Rose's (1963) discussion 
of the power of actuarial tables, the follow- 
ing hypothesis was advanced: 


Psychological test scores of the sort usually 
employed as the independent variables in psycho- 
logical prediction are notoriously unreliable.... 
Treating such unreliable observed scores as continu- 
ous variables, a practice followed by all of the 
conventional prediction methods, results in the 
achievement of a spurious precision. Since only a 
portion of the total observed score variance can 
in principle be lawfully related to any criterion, a 
method which allows the prediction equation to 
be determined in part by the unreliable component 
of the variance of the predictors may yield a less 
than optimum cross-validity. We are now investi- 
gating the (radical) hypothesis that any unreliable 
independent variable will yield higher predictive 
validity when relatively coarsely partitioned than 
when treated as continuous [italics added] and that 
the optimum number of intervals into which each 
predictor should be partitioned decreases with de- 
creasing reliability [Lykken & Rose, 1963, p. 140]. 


Tests of this hypothesis were available
in the present study, since most of the
signs had been partitioned into 10 intervals.
As a test of Lykken and Rose's hypothesis,
one sign (Sc - Neurotic Triad)
was correlated with criterion diagnoses in
unpartitioned form and the resulting validity
coefficients compared with those reported
in Table 2. While the differences
between the grouped and ungrouped validity
coefficients were negligible (thus not
calling into question the grouping procedure
used in this study), these differences
were never in the direction predicted by
Lykken and Rose. The differences ranged
from .00 to .05 (the latter in Sample D),


the ungrouped correlations always being at 
least as high as the grouped correlations. The
overall validity coefficients for the 861 cases 
were .36 (grouped) vs. .38 (ungrouped). It 
is important to realize that while this find- 
ing supports the argument that grouping 
data may not greatly decrease validity co- 
efficients (especially when the number of 
partitions is as large as it was in this study), 
it provides evidence against the assertion 
that “.... any unreliable independent vari- 
able will yield higher predictive validity 
when relatively coarsely partitioned than 
when treated as continuous. . . .” Other nega- 
tive evidence bearing on the Lykken and 
Rose hypothesis has already been discussed 
in the section reporting results of the linear 
regression analyses. 


Problem of Criterion Contamination 


The validity coefficients and accuracy 
percentages reported so far may have been 
inflated by the contamination of criterion 
diagnoses with MMPI interpretations. A 
comparison of findings from the most con- 
taminated samples with those from the 
least contaminated samples should help 
clarify the effects of such criterion con- 
tamination. While Meehl and Dahlstrom 
(1960) classified Sample A (Peterson's
subclinical schizophrenia sample) as un- 
contaminated, the diagnoses which initially 
classified patients for Peterson’s (1954a) 
study may have been quite highly con- 
taminated; consequently, the degree of 
contamination for this sample might best 
be described as unknown.’ If Sample A is 


*'The procedure for criterion classification for 
Sample A differed in many important respects from 
that employed for the other six cross-validation 
samples, and it is correspondingly more difficult 
to evaluate the possibilities for criterion contami- 
nation in this sample than for the others. The 
“neurotic” group of patients were so classified both 
as a function of original diagnoses (in which an 
unknown amount of MMPI contamination was 
possible) and as a function of having no subse- 
quent history of psychiatric hospitalization in 
Minnesota. No patients were included as neurotic 
if they had been initially diagnosed schizophrenic 
(or “latent,” “incipient,” or “subclinical” schizo- 
phrenia) even though they did not show any sub- 
sequent need for hospitalization as psychiatric 
patients. Consequently, the criterion classification 
therefore disregarded and the remaining
three most contaminated samples (G, C, 
and F) are compared with the three least 
contaminated samples (E, B, and D), 
great differences can be seen between the 
two sets of samples in the predictability 
of their diagnoses from the MMPI. For 
example, the average judge had a validity 
coefficient of .31 for the average of the more 
contaminated samples, but only .18 for 
the average of the less contaminated 
samples (this difference being statistically 
significant, p < .05). For the average of 
the more contaminated samples, 20 of the 
43 initially valid signs had validity coeffi- 
cients > .30, while for the average of the 
less contaminated samples only 10 signs 
met this criterion.

In order to discover the extent to which 
criterion contamination influenced the relative
validity of predictive indexes, validity
coefficients were computed separately for 
the composite less contaminated sample 
(N = 300) and for the composite more 
contaminated sample (N = 469); Sample
A was excluded from these analyses? 
Table 11 presents these validity coefficients 
for the 11 scales, the 8 ranks, and the 12 
signs which had relatively high validity in 
previous analyses for any of the cross- 
validation samples. As Table 11 indicates, 
the problem of criterion contamination is 
a highly significant one. The MMPI scales 
F and Sc (with virtually zero validity in
the less contaminated sample) each had 
validity coefficients of .27 in the more con- 
taminated sample, suggesting that when 
an MMPI was available, patients with 
high F or Sc scores were more likely to be 


for this sample is a mixture of concurrent psychi- 
atric diagnosis (for the “neurotics”) and subse- 
quent psychiatric hospitalization (for the 
“psychotics”). For these reasons, it seems wise to
consider the degree of contamination as “unknown.” 

° However, the composite less contaminated 
sample no doubt differs from the composite more 
contaminated sample in ways other than the de- 
gree of criterion contamination. For example, the 
base rate of psychotic diagnoses is 51% in the less 
contaminated sample as compared to 41% in the 
more contaminated sample. Moreover, as Appendix 
A indicates, there are some differences in mean 
scale scores between the two samples; mean scores 
on Hs, D, Hy, Pt, and Sc are higher in the more 
contaminated sample.


diagnosed psychotic (although the scales
themselves bear no relationship to the
criterion). Similarly, Pa had no validity
in the less contaminated sample, but had a
validity coefficient of .23 in the more contaminated
sample. Pt, on the other hand,
had a validity coefficient of -.19 for the
less contaminated sample while its validity
virtually disappeared for the more contaminated
sample. Similar findings occurred
among the eight ranks; the ranks of
Hs and Sc were substantially more valid
for the more contaminated than for the
less contaminated sample, while the rank
of Pt was more valid for the less contaminated
than the more contaminated
sample.


TABLE 11

VALIDITY COEFFICIENTS FOR LESS CONTAMINATED
VS. MORE CONTAMINATED SAMPLES

                                  Less            More
                                  contaminated    contaminated
                                  sample          sample
                                  (N = 300)       (N = 469)

Scales
  L                                .13             [?]
  F                               -.04             .27
  K                                .04             .08
  Hs                              -.11            -.19
  D                               -.22            -.11
  Hy                              -.18            -.19
  Pd                               .06             .19
  Pa                               .04             .23
  Pt                              -.19             .04
  Sc                               .00             .27
  Ma                               .15             .12
Ranks
  Hs                               .03             .29
  D                                .25             .22
  Hy                               .17             [?]
  Pd                              -.16            -.17
  Pa                              -.17            -.21
  Pt                               .24             .00
  Sc**                            -.17            -.37
  Ma                              -.19            -.09
Signs
  N - P*                          -.12            -.41
  N                               -.16            -.18
  Low Point Rules                 -.16             [?]
  Sc - N**                         .21             .43
  D - Pd                           [?]            -.21
  Taulbee-Sisson Signs*           -.27            -.40
  (G/X)                            .30             .17
  High Point Rules                 .32             .30
  Ranks: Hs + D + Hy + Pt          .35             .41
  L + Pa + Sc - Hy - Pt*           .35             .48
  Meehl-Dahlstrom Rules*           .29             .42
  Ranks: [?]                       .37            -.39

* Differences in validity coefficients between
samples significant, p < .05.
** Differences in validity coefficients between
samples significant, p < .01.


Inspection of the validity coefficients for 
the 12 signs listed in Table 11 reveals 
additional evidence of the effects of crite- 
rion contamination. For example, the va- 
lidity of the Taulbee-Sisson Signs in the 
less contaminated sample (.27) rose to .40 
in the more contaminated sample. Even 
more striking were the results for Sc - N
and N - P, where validity coefficients
changed from .12 and .21 in the less con- 
taminated sample to .41 and .43 in the 
more contaminated sample. Interestingly, 
the Meehl-Dahlstrom Rules produced va- 
lidity coefficients of .29 in the less con- 
taminated sample against .42 in the more 
contaminated sample. More surprisingly, the 
new index, L + Pa + Sc - Hy - Pt,
also showed a significant increase in valid- 
ity in the more contaminated as compared 
to the less contaminated sample. In the 
other direction, the elevation of Pd + Ma
relative to the overall elevation of the
profile (G/X) had a validity of .30 in the
less contaminated sample which fell to .17
in the more contaminated sample.

The rank correlation (rho) between
samples for the 12 signs listed in Table 11
is .39; for the eight ranks, the corresponding
rho is -.33; while for the 11 scales the
rank correlation drops to -.59! Since the
relative validity of predictive indexes has
been shown to be markedly affected by
criterion contamination, the most valid
signs for new uncontaminated samples
might well be found from analyses of the
300 subjects in the less contaminated subsample
rather than the 861 subjects in the
total cross-validation sample. For this
reason, additional analyses were carried
out on the less contaminated sample alone,
utilizing signs previously considered as
well as some new indexes.
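
For readers who wish to see how a rank correlation between two columns of validity coefficients is obtained, a minimal sketch follows. It assumes the usual definition of rho as the product-moment correlation of the ranks, with tied values given their average rank; whether the original computation ranked signed or absolute coefficients is not stated, and the coefficients in the example are hypothetical rather than entries from Table 11.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average position of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical validity coefficients for five indexes in two samples.
less_contaminated = [0.10, 0.25, 0.05, 0.30, 0.20]
more_contaminated = [0.35, 0.20, 0.40, 0.15, 0.30]
print(round(spearman_rho(less_contaminated, more_contaminated), 2))
```

A negative rho, as in this invented example, means that the indexes that rank highest in one sample tend to rank lowest in the other.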

A linear regression analysis utilizing nine
variables (L, F, D, Hy, Pa, Pt, Sc, number
of clinical scales ≥ 70, and the High
Point Rules) achieved a non-cross-validated
multiple R of .46 in the less contaminated
sample, and a number of
weighted linear composites of sets of these
nine variables yielded validity coefficients
around .40. One of the best indexes for
differentiating psychotics from neurotics
in the less contaminated sample was the
simple unweighted sum of the ranks of D 
+ Hy + Pt (which achieved a validity 
coefficient of .39). Using a cutting score 
either between 10 and 11 or between 12 
and 13 (low scores called neurotic and 
high scores called psychotic), this index 
diagnosed 67% of the 300 patients ac- 
curately. When scores of 11 and 12 were 
called “indeterminate” (16% of the sam- 
ple), the accuracy percentage for the diag- 
nosed cases rose to 70%. This new index 
was more valid than the Meehl-Dahl- 
strom Rules (.35) as well as the best diag- 
nostician (.28) for the less contaminated 
sample. In clinical settings, whether one
chooses to use this index of scale ranks
(with its non-cross-validated coefficient of
.39 on 300 relatively uncontaminated
cases and .34 on the total sample) or the
composite of scale scores, L + Pa + Sc -
Hy - Pt (with its non-cross-validated
coefficient of .44 on the larger sample of
861 cases and .35 on the less contaminated
subsample), will depend upon the reader’s
estimate of the significance of the problem 
of criterion contamination. A more defini- 
tive choice between both indexes must 
await their comparative cross-validation 
in new settings where the diagnostic crite- 
rion has not been contaminated by MMPI 
interpretations. 


Discussion 


The finding that the average diagnosti- 
cian’s predictive powers for this diagnostic 
problem can be surpassed by utilizing any 
of a number of relatively simple actuarial 
indexes should come as no surprise to any- 
one familiar with the growing literature 
comparing clinical and statistical predictions.
Moreover, the fact that staff judges
and trainees achieved the same degree of 
accuracy for this diagnostic task corrobo- 
rates other studies (e.g., Goldberg, 1959) 
in indicating that clinical experience is no 
guarantee of predictive acumen. It seems 
reasonable to assume that the typical 
clinical setting does not provide systematic 
feedback of the predictor-outcome con- 
tingencies necessary for optimizing a 
judge’s accuracy. 

The present study demonstrated that 
linear composites of single-scale scores 
(such as L + Pa + Sc — Hy — Pt) out- 
performed all diagnosticians. Since con- 
figural actuarial techniques did not display 
greater relative validity than linear indexes 
for this diagnostic problem, it seems quite 
possible that Meehl chose the wrong task 
for testing the clinician’s purported ability 
to utilize complex configural relationships. 
While the clinician might be able to 
vanquish the linear regression equation in 
some other more configural arena, there 
seems no reason to assume that configural 
actuarial techniques would not, in turn, be 
able to outperform the clinician. 

On the other hand, the present study 
provides evidence that configural actuarial 
models (while extracting more variance 
from derivation samples than linear
models) result in greater shrinkage upon 
cross-validation, a finding which has been 
reported in the psychometric literature for 
some time (e.g., Ward, 1954). While psy- 
chologists certainly should not give up the 
search for more powerful configural tech- 
niques, they are likely to need sample 
sizes as much as 100 times as large as are 
found in most studies and at least 10 
times as large as the total sample analyzed 
in the present study. The problem of sample 
size is the Achilles heel in Lykken and Rose’s 
(1963) proposal for the use of actuarial ta- 
bles. Even relatively simple actuarial tables 
involve large numbers of cells, and the un- 
equal distribution of persons to cells makes 
stable population estimates within each cell 
extremely difficult when the sample is not 
very large. Perhaps the great power of lin- 
ear regression methods lies in their ability 


to utilize the small samples typically avail- 
able. 

The validity of even the best of the 
signs reported in the present study may 
provide no reassuring balm for MMPI 
users. While most evidence (e.g., Little & 
Schneidman, 1959) would indicate that 
other techniques available at the present 
time—projective or objective—would do 
no more valid a job for this diagnostic 
problem than the MMPI, maximum hit- 
rates around 70% are nonetheless dis- 
couraging. On the other hand, any consid- 
eration of degree of validity is necessarily 
clouded by issues related to the nature of 
the criterion to be predicted. In the case 
of diagnosing psychotic from neurotic pa- 
tients, an important issue concerns the 
reliability of the criterion classifications. 
If the reliability of the criterion is actu- 
ally quite low, then the present study may 
have uncovered indexes near the asymp- 
tote of predictive validity. More impor- 
tantly, some of the indexes discovered in 
the present study could ultimately have 
more prognostic significance than the psy- 
chiatric diagnoses upon which they were 


originally based. For a discussion of this


possibility, typically referred to as the 
“bootstraps” effect, see Cronbach and 
Meehl (1955). 

Evidence regarding the reliability of
psychiatric diagnoses (e.g., Ash, 1949;
Foulds, 1955; Mehlman, 1952; Schmidt &
Fonda, 1956; Seeman, 1953) has led to
varying interpretations, in part because
these studies differed in their experimental
designs and in the psychiatric diagnoses
considered. Studies which indicate that individual
psychiatrists tend to give different
distributions of diagnoses within the
differing patient populations with which
they work (as well as studies indicating
that individual clinical psychologists differ
in their agreement with a particular psychiatrist)
do not furnish direct evidence
as to the reliability of the diagnoses ascribed
by a typical psychiatric team. To the author’s
knowledge, there has never been a
study comparing diagnoses made by two
or more psychiatric teams on the same
patients at the same point in time.

One study which approximated this
model experimental design, however, was
carried out by Hunt, Wittson, and Hunt 
(1953) on naval personnel. Analyzing the 
records of 681 neuropsychiatric patients 
who had been referred from a precommis- 
sioning station to a naval hospital, these 
investigators found that 54% of the diagnoses
(when grouped into the three broad
nosological categories: “neurosis,” “psychosis,”
and “personality disorder”) were
in agreement between the two installations.
Because of (a) the atypical population of
patient samples, (b) the lack of reported
information concerning the characteristics
of the two psychiatric teams, (c) the possibility
of some contamination in diagnostic
judgments between the two installations,
and (d) the incomplete reporting of
diagnostic errors, this study does not supply
very firm evidence regarding the reliability
of psychiatric diagnoses. While
the 54% agreement percentage is greater
than one might expect by chance alone
(approximately 33%), the investigators
conclude that: “On any absolute evaluative
scale, however, this reliability certainly
cannot be considered either high or
satisfactory [Hunt, Wittson, & Hunt, 1953,
p. 60].”
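
The comparison of an observed agreement rate with its chance level can be made explicit. The sketch below assumes, purely for illustration, that both installations assigned the three broad categories equally often, in which case chance agreement is one in three; the actual marginal distributions are not given here, and unequal marginals would change the chance figure. The chance-corrected coefficient shown (Cohen's kappa) is not a statistic used in the study under discussion.

```python
from typing import Sequence


def chance_agreement(p_first: Sequence[float], p_second: Sequence[float]) -> float:
    """Expected agreement if two sources assigned categories independently."""
    return sum(a * b for a, b in zip(p_first, p_second))


def kappa(observed: float, chance: float) -> float:
    """Agreement beyond chance, as a fraction of the maximum possible."""
    return (observed - chance) / (1.0 - chance)


# Assumption: three categories used equally often by both installations.
uniform = [1 / 3, 1 / 3, 1 / 3]
expected = chance_agreement(uniform, uniform)
print(round(expected, 2), round(kappa(0.54, expected), 2))
```

Under this assumption the observed 54% agreement amounts to roughly a third of the possible improvement over chance, which is one way of expressing the investigators' judgment that the reliability is neither high nor satisfactory.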

An important problem related to diagnostic
reliability involves the temporal
stability of the classifications neurotic vs. 
psychotic. Rennie (1953), in a 20-year 
follow-up study of 200 neurotic patients, 
found that approximately 14% of the 
initially diagnosed neurotics were later re- 
classified as psychotic. In an earlier study, 
Masserman and Carmichael (1938) stud- 
ied intensively a smaller group of psychi- 
atric patients of varied diagnoses for at 
least 1 year past their discharge from the 
hospital. Of 24 patients originally given 
some variety of neurotic diagnosis, 25%
were later diagnosed psychotic; on the
other hand, of 54 patients originally diagnosed
psychotic, less than 6% were subsequently
diagnosed neurotic. Generalizing
from these two studies, it seems reasonable 
to assume that the ascription of a psychotic 
diagnosis to a patient may be more 
“accurate” (in terms of temporal stability) 
than the ascription of a neurotic diagnosis. 
If, in fact, there are more diagnostic errors 
committed on patients called neurotic than
on those called psychotic, then it could 
be argued that more patients should be 
called psychotic than are presently so diagnosed.
On the other hand, if the ascription
of a psychotic diagnosis typically results 
in a “no treatment” decision for an indi- 
vidual patient, the values of most clinicians 
—and probably of society in general— 
would lead diagnosticians in the exact 
opposite direction (i.e., to decrease psychotic
diagnoses).

It should be obvious that the usefulness 
of an index for any particular clinical 
setting is a joint function of the differential 
utilities underlying errors in diagnoses and 
the population base rates of the diagnostic 
categories (Cronbach & Gleser, 1957; 
Cureton, 1957; Dawes, 1962; Meehl & 
Rosen, 1955; Rimm, 1963). It should cer- 
tainly be borne in mind that the results of 
the present study (especially the cutting 
scores for various predictive indexes) can 
be applied to other settings only with 
considerable caution. Especially when the 
base rate of psychotics vs. neurotics varies 
greatly from a 50-50 split, new cutting 
scores must be considered (Rorer, Hoffman, 
La Forge, & Hsieh, 1964). Moreover, some 
indexes which were relatively invalid in 
the present study may be more useful in 
clinical settings in which the base rates are 
particularly extreme. Conversely, some of 
the best indexes discovered in the present 
study may prove worthless in these same 
settings. 

Perhaps of even greater significance, the 
findings from this study have been inter- 
preted so far as if an error of one kind 
(Type I) is equally as important as an 
error of the other kind (Type II). In clinical
practice, however, the utilities of different
types of errors for particular decisions
are often unequal, and consequently the 
diagnostic sign which is maximally effec- 
tive in eliminating overall errors may not 
be most useful in any particular applied 
setting. In order to maximize the utility of 
decisions made by any predictive index, it 
is necessary to adjust the cutting scores 
for the index so as to optimize the resulting 
decision payoff matrix (Cronbach & Gleser, 
1957). Since utility functions may be expected
to vary from one setting to another,
the present study (utilizing cutting scores
set so as to optimize overall accuracy in
the derivation sample, i.e., Type I and
Type II errors being weighted equally)
provides no exact determination of the
best index in settings where the utilities for
the two types of errors are not equal.
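
The dependence of the best cutting score on base rates and error utilities can be illustrated with a small sketch. Everything in it is hypothetical: the two sets of index scores, the candidate cuts, and the cost figures are invented, and the expected-cost rule is simply one straightforward way of combining a base rate with unequal costs for the two kinds of error, not a procedure taken from the sources cited.

```python
from typing import Sequence


def expected_cost(
    cut: float,
    neurotic_scores: Sequence[float],
    psychotic_scores: Sequence[float],
    base_rate_psychotic: float,
    cost_miss: float,
    cost_false_alarm: float,
) -> float:
    """Expected cost per patient when scores >= cut are called psychotic."""
    miss_rate = sum(s < cut for s in psychotic_scores) / len(psychotic_scores)
    false_alarm_rate = sum(s >= cut for s in neurotic_scores) / len(neurotic_scores)
    return (
        base_rate_psychotic * miss_rate * cost_miss
        + (1.0 - base_rate_psychotic) * false_alarm_rate * cost_false_alarm
    )


def best_cut(candidates, neurotic_scores, psychotic_scores,
             base_rate_psychotic=0.5, cost_miss=1.0, cost_false_alarm=1.0):
    """Candidate cutting score with the lowest expected cost."""
    return min(
        candidates,
        key=lambda c: expected_cost(
            c, neurotic_scores, psychotic_scores,
            base_rate_psychotic, cost_miss, cost_false_alarm,
        ),
    )


# Hypothetical index scores for two small criterion groups.
neurotic = [20, 25, 30, 35, 40, 50]
psychotic = [30, 45, 50, 55, 60, 70]
cuts = [25, 30, 35, 40, 45, 50, 55, 60]

print(best_cut(cuts, neurotic, psychotic))                           # equal costs, 50-50 split
print(best_cut(cuts, neurotic, psychotic, cost_miss=5.0))            # missed psychotics cost more
print(best_cut(cuts, neurotic, psychotic, base_rate_psychotic=0.1))  # psychotics are rare
```

In this invented example the preferred cut moves downward when a missed psychotic diagnosis is the costlier error and upward when psychotic patients are rare, which is the sense in which a cutting score tuned for overall accuracy in one setting need not be the best choice in another.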

Any evaluation of the usefulness of a 
diagnostic index must consider the circumstances
under which the index is to be used.
Specifically, the actual usefulness of any 
technique can only be assessed through dis- 
covery of its incremental validity (Se- 
chrest, 1963) over information already 
available. In diagnosing psychotic from 
neurotic patients, at least a modest amount 
of diagnostic information is typically 
available from intake records (e.g., admis- 
sion complaints, frequency and duration of 
past hospitalizations, previous diagnoses, 
etc.). It may well be that indexes which
differentiate with reasonable validity over 
the entire range of psychoticism to neuroti- 
cism may be useless for diagnosing “bor- 
derline” patients. The incremental valid- 
ity of all MMPI indexes must ultimately 
be ascertained for those cases where the 
diagnostic decision is still unclear after 
evaluation of previously available information
(e.g., for “doubtful” or “mixed”
initial diagnoses). Since intake informa- 
tion was not available for the present 
study, no evaluation of the incremental 
validity of the MMPI predictive indexes 
could be carried out. Parenthetically, it is 
important to remember that indexes which 


permit an “indeterminate” diagnosis (as 
do the Meehl-Dahlstrom Rules) may turn
out to be the least useful, since it could be 
that it is for precisely these indeterminate 
cases that MMPI information was sought 
in the first place (Meehl & Dahlstrom, 
1960). 

Finally, in evaluating the effectiveness 
of the MMPI for this differential diagnos- 
tic problem, one must remember that the 
only data available for the present study 
were scores on 11 MMPI scales. There are 
by now well over 200 other MMPI scales 
(Dahlstrom & Welsh, 1960), and it would 
not be surprising if some of these scales,
as well as some combinations of these
scales, could improve on the validity of
indexes analyzed in this study. Moreover,
because the original answer sheets were not 
available for the present study, no analyses 
could be carried out at the item level. It 
is not unreasonable to suppose that an 
empirically constructed scale, made up 
from criterion groups of neurotics and 
psychotics, respectively, might cross-vali- 
date with coefficients greater than those of 
any index analyzed in the present study. 
Consequently, while the present study 
should furnish considerable evidence for 
the relative validity of elements from the 
standard MMPI profile (for this differen- 
tial diagnostic problem with male psychi- 
atric patients), the study must not be 
construed as providing evidence for an 
asymptote of predictive validity from the 
total MMPI item pool. 
APPENDIX


TABLE A1
MEANS (M) AND STANDARD DEVIATIONS (σ) OF THE ELEVEN MMPI SCALES (K-CORRECTED
T SCORES) IN ALL SAMPLES


Sample              L    F    K    Hs   D    Hy   Pd   Pa   Pt   Sc   Ma
Derivation     M    53   59   54   69   75   67   64   62   70   70   56
(N = 402)      σ     5    9   10   17   19   13   13   12   17   17   12
A              M    53   62   52   71   75   70   65   62   72   74   60
(N = 92)       σ     6   11    9   15   17   11   14   13   16   20   13
B              M    53   58   53   66   70   65   67   59   65   67   59
(N = 77)       σ     5    8    9   17   17   12   12   12   15   15   12
C              M    52   59   54   71   83   71   68   65   78   74   57
(N = 103)      σ     4    8    9   17   19   13   12   12   17   16   12
D              M   [?]   56   56  [?]   66   59  [?]   55   63   64   58
(N = 42)       σ     7    8    9   15   19   11   11   11   16   13   12
E              M   [?]  [?]   54   60   70   63   68   58   65   64   56
(N = 181)      σ     7    8    9   15   16   11   11   10   14   18   12
F              M    51   59   53   67   77   68   67   61   75   72   57
THE RELATIONSHIP OF BIRTH ORDER TO SELF-EVALUATION, 
ANXIETY REDUCTION, AND SUSCEPTIBILITY TO 
EMOTIONAL CONTAGION¹ 


KENNETH RING, C. E. LIPINSKI, AND DOROTHEA BRAGINSKY 


University of Connecticut 

A 2 X 2 X 2 factorial experiment was performed to test the following hypothe- 
sis: 1st-born individuals have a greater need for both self-evaluation and 
anxiety reduction than later borns. The experimentally manipulated varia- 
bles were: (a) opportunity for anxiety reduction and (b) ease of self-evalua- 
tion. 64 female undergraduates served as Ss (with an additional 32 in control 
conditions), of which half were first born, half later born. The experiment 
provided strong evidence that the need for self-evaluation was greater for 1st 
borns but suggested that later borns have a greater need for anxiety reduc- 
tion. A developmental theory which attempts to integrate findings relating 
to birth order, motivation, and social behavior was presented. 


Ever since Schachter (1959) reported some surprisingly strong 
relationships between ordinal position of birth and affiliative 
behavior, we have witnessed a resurgence of interest in the 
correlates of birth order (e.g., Becker & Carroll, 1962; [illegible] 
Firestone, & Grinker, 1963; Sampson, 1962; Sarnoff & Zimbardo, 1961; 
Schachter, 1963, 1964; Smart, 1963; Staples & Walters, 1961; 
Zimbardo & Formica, 1963). While many of these studies have 
certainly been intriguing, there characteristically is no more than 
a perfunctory attempt to formulate an ad hoc “theory” to “account 
for” the results. Thoughtful conceptual work has been rare (there 
are, of course, exceptions to this generalization, e.g., Zimbardo & 
[illegible]). [illegible] birth order to certain psychological and 
social variables. In this section, we shall be concerned only with 
[illegible] assumptions and evidence on which they are based. After 
the data have been considered, we shall spell out the theory in 
detail. 

In his monograph, Schachter (1959) was concerned primarily with the 
determinants of affiliative behavior under conditions of 
experimentally induced anxiety. He concluded, after a series of 
laboratory studies, that there were two principal motivational 
factors to which this kind of behavior could be ascribed, viz., the 
need for direct anxiety reduction and the need for self-evaluation. 
Concerning the former, Schachter (1959) comments: 


People do serve a direct anxiety-reducing function for one another. 
They comfort and support, they reassure one another and attempt to 
bolster courage...it is possible that highly anxious subjects chose 
“Together” as a means toward this sort of social reassurance and 
toward reducing anxiety [p. 26]. 


As for self-evaluation, Schachter (1959) writes: 


One may use other people to evaluate his emotions and feelings. In a 
novel, emotion-producing situation, unless the situation is 
completely clear-cut the feelings one experiences or “should” 
experience may not be easily interpretable, and it may require some 
degree of social interaction and comparison to appropriately label 
and identify a feeling. We are suggesting, of course, that the 
emotions are highly susceptible to social influence and that...a 
need for social evaluation of the emotions may be active [p. 26]. 


¹ This research was supported by a University of Connecticut 
[illegible] (No. 94) to the [illegible]. 


Schachter’s work has, of course, generated criticism as well as 
research. For example, Rabbie (1963) has questioned the 
importance of certain motivations which 
Schachter had, to his satisfaction, ruled 
out; Gerard and Rabbie (1961) have ex- 
pressed reservations concerning the meth- 
ods used to demonstrate the presence of a 
need for self-evaluation; and Zimbardo and 
Formica (1963) have quarreled with 
Schachter’s theorizing about his ordinal 
position effects. In view of such critical 
commentary, it is surprising to learn that 
the principal experimental study which 
seems to furnish the only positive evidence 
for Schachter’s final conclusion, one con- 
ducted by Wrightsman (1960), has yet to 
be carefully scrutinized. Since this investi- 
gation is the starting point for the research 
reported below, it will be helpful to review 
Wrightsman’s experiment at some length. 

Wrightsman created three experimental 
conditions in each of which subjects were 
exposed to an anxiety-arousing manipula- 
tion. In the “alone” conditions, Ss (sub- 
jects) filled out a questionnaire item re- 
lating to felt anxiety immediately after 
receiving the manipulation, waited by 
themselves for 5 minutes, then again re- 
sponded to the same questionnaire item. In 
the “together-no talk” condition, after the 
prewaiting period anxiety measure, Ss were 
brought together in groups of four for an 
interval of 5 minutes after which they again 
indicated their level of anxiety. The Ss 
in this condition were instructed not to talk 
to one another during the waiting period. 
The "together-talk" condition was identical 
except that Ss were encouraged to talk. 

As for the expected findings, Schachter 
(1959), in describing this experiment, 
makes the following argument: 


If the presumed anxiety-reducing property of 
group membership is a potent determinant of the 
choice of "Together" (i.e., wanting to affiliate), it 
should be anticipated that being with others will 
actually reduce anxiety [p. 103]. 


This is obviously a plausible, though not a 
logical, assertion. Clearly, simply because 
a need is operative, it does not follow that 
group interaction will necessarily gratify it. 
Gratification of the need, as Schachter 
(1959, p. 112) elsewhere notes, depends on 
the nature of the group interaction, not 


merely on the fact that group interaction 
occurs. 

Ignoring birth-order differences, Wrights- 
man failed to find that anxiety was reduced 
to a greater extent in the together condi- 
tions. However, he was able to show that a 
significantly greater decrement in anxiety 
did occur for first-born Ss in the together 
conditions, as compared to the alone con- 
dition. Necessarily, in view of the overall 
finding, no such trend was evident for later 
borns. Thus, there are data consistent with 
the hypothesis that the need for anxiety 
reduction is a determinant of affiliation 
under conditions of anxiety arousal, but 
only for first-born individuals. 

Wrightsman’s experiment, however, was 
designed also to provide data concerning 
the need for self-evaluation as a source of 
affiliative behavior. Again, let us quote 
Schachter (1959) on this point: 


If evaluative needs are major determinants of the 
relationship between anxiety and affiliation, it 
should be anticipated that being with others will 
lead to homogeneity of emotional intensity among 
the group members.... This expectation derives 
from the set of assumptions [those of Festinger's, 
1954, social comparison theory]... which indi- 
cate that social evaluation is possible only when 
there is relative homogeneity among the mem- 
bers of a reference group. Assuming evaluative 
needs and granting a situation in which evaluation 
is possible only through social comparison proc- 
esses, it follows from the above that if discrepancies 
exist among group members, pressures will arise to 
reduce such discrepancy [pp. 103-104]. 


It may indeed follow, according to 
theory, that where discrepancies exist pres- 
sures will be set up to reduce them, but 
such an assumption does not logically de- 
mand that a group, subject to such pres- 
sures, will become more homogeneous in 
emotional intensity, as Schachter proposes. 
For that to happen, these pressures toward 
uniformity of emotion would have to exert 
a significant effect on the group, an occur- 
rence which clearly depends, once more, on 
the nature of the group interaction and the 
forces which govern it. A person may be 
hungry, but that doesn’t mean he will eat; 
it means only that he should make at- 
tempts to eat. It is clear, we think, that 
Schachter’s arguments concerning both 
anxiety reduction and self-evaluation suffer 
from the same logical flaw, viz., the hidden 
assumption that the existence of a motiva- 
tion requires its satisfaction. In the present 
context, it is obvious that Schachter would 
not want to conclude, if no tendencies to- 
ward emotional homogenization should be 
observed, that the need for self-evaluation 
was not operative, but merely that the 
conditions of group interaction were not 
such as to gratify it. 

Wrightsman, in fact, did find evidence 
for significantly greater homogenization of 
emotional intensity in the together condi- 
tions, thus supporting a self-evaluation 
interpretation. Nevertheless, there remains 
a certain ambiguity concerning some of the 
implications of Wrightsman's data, espe- 
cially in regard to the self-evaluation hy- 
pothesis. 

To take a minor point first, the index of 
homogenization used by Wrightsman con- 
sisted of the following fraction: the range 
of anxiety scores after the 5-minute waiting 
period over the range of scores before this 
period. Any fraction less than 1 would then 
be taken as indicative of increasing uni- 
formity of emotional state. Why a range 
measure was employed instead of a similar 
ratio based on the standard deviation of 
anxiety scores is not clear. Surely the latter 
would have been preferable since it takes 
into account all the data and not just those 
of the two extreme individuals of each 
group and is, therefore, a more sensitive 
indicator of emotional homogeneity. 
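
The contrast between the two indexes can be illustrated with a minimal 
sketch. The scores below are hypothetical (they are not Wrightsman's data), 
and the function names are ours; the population standard deviation is an 
assumed choice, since the text does not say which form would be used. 

```python
from statistics import pstdev

def range_ratio(before, after):
    """Range of scores after the wait divided by the range before it."""
    return (max(after) - min(after)) / (max(before) - min(before))

def sd_ratio(before, after):
    """Standard deviation after the wait divided by the one before it."""
    return pstdev(after) / pstdev(before)

# Hypothetical anxiety ratings for one group of four Ss.
# The two middle Ss converge; the two extreme Ss do not move.
before = [20, 40, 60, 80]
after = [20, 50, 50, 80]

print(range_ratio(before, after))  # 1.0: the range index registers no change
print(sd_ratio(before, after))     # below 1: the SD index registers convergence
```

In this constructed case the range ratio stays at 1 because only the two 
extreme scores enter it, while the ratio of standard deviations falls below 
1 because every score contributes. 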

Another difficulty in interpreting these 
data as favorable to a self-evaluation inter- 
pretation stems from the finding that the 
index of homogenization is in part artifac- 
tual due to the fact that anxiety generally 
declines over time. Since there is a tendency 
for scores to move toward zero (the floor 
of the scale), homogeneity will necessarily 
be brought about as a function of general 
anxiety-reduction tendencies. Actually, the 
correlations between the index of homo- 
geneity and anxiety reduction were .57 
(p < .02) for the together-no talk condi- 
tion and .31 (n.s.) for the together-talk 
condition. Thus, there is some evidence that 
the index used is by no means a “pure” meas- 
ure of self-evaluative motivations. 

Finally, Wrightsman presents no evi- 
dence concerning the relative influenceabil- 
ity of emotional states of first-born and 
later-born Ss. This is surprising indeed be- 
cause everything in Schachter's monograph 
would suggest that first borns should be 
more responsive and more likely than later 
borns to move toward the perceived modal 
anxiety level of the group. Such a finding 
would have been strong evidence in support 
of greater tendencies toward self-evaluation 
in first borns and would be highly con- 
sistent with his affiliation data. This point 
is important and deserves some amplifica- 
tion. 

With respect to ordinal position, Schach- 
ter’s principal finding is that first-born 
individuals want to affiliate when afraid; 
later borns do not especially or at least are 
more indifferent about affiliation under 
anxiety-arousing conditions. On the basis 
of Wrightsman’s experiment, Schachter 
asserts that there are two main reasons why 
people affiliate when afraid: they want 
anxiety reduction and/or self-evaluation. 
But since it is largely first borns who want 
to affiliate in the first place, it is necessary 
to show specifically that they have these 
needs, or, at least, that they have stronger 
needs in this regard than later borns. There 
are some confirmatory data concerning 
anxiety reduction, but, as we have indicated, 
there is no direct evidence from Wrights- 
man’s experiment that first borns exceed 
later borns in the need for self-evaluation. 
Thus, a crucial link in Schachter’s argu- 
ment lacks empirical support. It is true 
that certain analyses made by Wrightsman 
(which we have not cited) do provide data 
consistent with self-evaluation interpreta- 
tions, but none of them bears on the rela- 
tionship of ordinal position to self-evalua- 
tion. 

In the experiment described below we 
hope to clarify some of the ambiguities of 
the Wrightsman study through independent 
manipulations of the opportunity for anx- 
iety reduction and self-evaluation. 

We shall start by making two assump- 
tions in regard to birth-order differences 
and then trace out the implications of these 
assumptions for behavior in a group set- 
ting: 


1. First borns have a greater need for 
self-evaluation than later borns. 

2. First borns have a greater need for 
anxiety reduction than later borns. 

If we now make some additional assump- 
tions which specify certain motivational 
determinants of liking for other persons, 
we shall be in a position to make some 
differential predictions for first borns and 
later borns. These assumptions are: 

1. The greater the number of needs one 
person is perceived as gratifying for an- 
other, the more that person is liked; and 
conversely, the greater the number of 
needs one person is perceived as frustrating 
for another, the more that person is dis- 
liked. 

2. The gratification of an important need 
generates more liking for a person per- 
ceived as instrumental to that gratification 
than does the gratification of an unimpor- 
tant need; and conversely, the frustration 
of an important need generates more dis- 
like for a person perceived as responsible 
for that frustration than does the frustra- 
tion of an unimportant need. 

The utility of similar or virtually equiv- 
alent assumptions has been demonstrated 
in the work of Schachter (1951), Festinger 
(1954), and Ring (1964). 

Finally, let us now imagine the four fol- 
lowing conditions: (a) where both the 
needs for anxiety reduction and self-evalu- 
ation are gratified; (b) where the need for 
anxiety reduction is gratified, but the need 
for self-evaluation is frustrated; (c) where 
the need for self-evaluation is gratified, but 
the need for anxiety reduction is frustrated; 
(d) where both needs are frustrated. Can 
we make differential predictions for first 
borns and later borns under these four 
conditions? 

From the first “liking” assumption alone 
it would clearly follow that an individual— 
whether he be first or later born—would 
most like individuals associated with a, 
least like those associated with d, and like 
to an intermediate degree those in b and c. 
However, the addition of Liking Assump- 
tion 2 makes it evident that this differen- 
tiation would be sharper (i.e., the extent of 
the difference between a and d would be 
greater) for first-born individuals. In fact, 
to the extent that the needs for anxiety 
reduction and self-evaluation are not impor- 
tant at all for later borns (see Birth-Order 
Assumptions 1 and 2), one would, of 
course, expect to find for the latter no dif- 
ferentiation among the four conditions in 
terms of liking; the differentiation would 
be anticipated only for first borns. 

We may therefore state as one hypothesis 
that the differentiation between a and d 
will be greater for first borns. Confirmation 
of this hypothesis will be taken as support 
for the assumptions that first borns exceed 
later borns in their need for anxiety reduc- 
tion and self-evaluation. 

It will be recalled that Wrightsman’s 
study was most unclear on the question of 
the relationship of birth order to self-evalu- 
ation. For this reason, two further tests of 
the self-evaluation assumption were planned 
and will now be briefly discussed. 

It is an axiom of social comparison the- 
ory (Festinger, 1954), which provides the 
theoretical underpinning for Schachter's 
(1959) work, that where discrepancies exist 
among the members of a group, pressures 
will arise which act to reduce these dis- 
crepancies. Theoretically, these pressures 
stem from the desire of the group members 
to evaluate themselves, and self-evalua- 
tions, so the theory goes, are likely to be 
unstable and unreliable where discrepancies 
exist. According to Festinger, one way to 
reduce discrepancies within a group is to 
change one's own position to bring it closer 
to the modal opinion in the group. In other 
words, movement toward either the “aver- 
age” or modal position of the group is con- 
strued, within the framework of this the- 
ory, as reflecting self-evaluative needs. 

If our self-evaluation assumption is cor- 
rect, we ought to find greater tendencies on 
the part of the first borns to move toward 
the average or modal emotional level in the 
group. When anxiety is the emotion in ques- 
tion, movement in the direction of the 
group should occur no matter whether the 
group level of anxiety is higher or lower 
than that of a given S. Such a suggestion is, 
of course, quite compatible with Schach- 
ter's theorizing, for he says: 


for first-borns...anxiety reduction is not an in- 
evitable consequence of being with others. First- 
born subjects ... are more influencible than later- 
born subjects. If the tenor of a group discussion 
were to be really terrifying and anxiety provoking, 
it might indeed be anticipated that the anxiety of 
first-borns would be greater than that of later- 
borns [p. 112]. 


A clear test of this hypothesis requires some 
control over the group interaction; this we 
have attempted through the simple expedi- 
ent of using experimental confederates. 

A second test of the self-evaluation as- 
sumption is possible by means of recourse 
to some data collected in connection with 
another study (Ring & Braginsky, unpub- 
lished study).² In this study, Ss, who are 
comparable to those who took part in our 
experiment, were administered a newly de- 
vised test which was designed to measure 
the need for self-evaluation. It is easy to 
compare first-born and later-born respon- 
dents in terms of their total score on this 
test. We should certainly expect first borns 
to score significantly higher. Of course, it 
will be necessary to show that this test 
actually does measure the need for self- 
evaluation; data in this connection will be 
presented when the test is discussed subse- 
quently. 

To summarize, the major purpose of the 
studies described below is to test the as- 
sumptions concerning birth-order differ- 
ences in the need for anxiety reduction and 
self-evaluation. If these assumptions and 
their implications are supported, they will 
provide the groundwork for a theory which 
attempts to incorporate the findings of a 
number of studies dealing with different 
correlates of ordinal position of birth. 


METHOD 
Subjects 


Ninety-six female undergraduate students served 
as Ss in the experiment. Sixty-four of these were 
allocated to one of four experimental treatments, 
while the remainder were used in one of two con- 
trol groups. All Ss were drawn from a subject pool 
formed from different sections of the introductory 
psychology course at the University of Connecti- 
cut. Participation is on a voluntary basis, and, in 
this experiment, there was no financial compensa- 
tion. Exactly half of the Ss were first born,³ and 
half were later born. 


² Unpublished study entitled “The measurement 
of Self-evaluation”; available from author. 


Experimental Design 


A 2 X 2 X 2 factorial design was used, each cell 
having an equal representation of first-born and 
later-born Ss. The two experimentally manipu- 
lated variables were level of group anxiety (high 
and low) and ease of self-evaluation (easy and diffi- 
cult). These variables were manipulated by using 
two female experimental confederates. In any 
given experimental session three individuals, ex- 
cluding E (experimenter), were present: the two 
stooges and a naive S. It is most convenient, in 
describing the experimental design, to consider the 
four experimental treatments separately: 

High Anxiety-Easy Evaluation (HE). In this 
condition, both stooges were instructed and trained 
to act in a rather anxious fashion about the im- 
pending experiment during a 5-minute waiting pe- 
riod when E was out of the room. In terms of an 
anxiety dimension, which was used for training 
purposes, stooges were told to attempt to give an 
impression consistent with their being located at 
70 on the anxiety scale which extended from 0 to 
100. (Verbal definitions were provided for 6 points 
on the scale: 0—not at all anxious; 20—a little 
anxious; 40—somewhat anxious; 60—moderately 
anxious; 80—quite anxious; 100—extremely anx- 
ious.) Since the stooges would appear anxious, it 
was reasoned that it would be difficult, if not im- 
possible, for most Ss to reduce their own anxiety 
by using the stooges as reference persons. At the 
same time, since the stooges through their behav- 
ior would be evincing approximately the same 
level of anxiety, it should not be difficult for an 
S to decide how she ought to feel. In other words, 
in an unstructured situation there is uniformity of 
emotional state in the others present; self-evalua- 
tive needs, therefore, may be easily gratified here. 
To summarize, the behavior of the stooges in this 
condition was intended to frustrate the need for 
anxiety reduction, but permit the gratification of 
the need for self-evaluation. 

High Anxiety-Difficult Evaluation (HD). Here, 
one stooge was instructed to represent herself as 
extremely anxious (100), while the other was told 
to appear somewhat anxious (40). It will be noted 
that the "average" of the anxiety levels of the 
stooges is precisely the same as in the HE condi- 
tion, that is, 70. Of course, it is nonsense to be- 
lieve that the perceived level of anxiety was the 
same in both conditions; it is a gross oversimpli- 
fication (but a necessary one) to say that individ- 
uals assess the level of emotionality in a group by 
a simple averaging of the levels of each member; 
obviously, things are a little bit more complicated 
than that! Nevertheless, the introduction of a 60- 
point discrepancy between the stooges has the ad- 
vantage of making self-evaluation difficult (see p. 
4). Thus, this condition is intended to frustrate 
both the needs for anxiety reduction and self- 
evaluation. 


³ First-born Ss include both only children and 
children with siblings. We follow Schachter (1959) 
here, who customarily placed all first borns into a 
single category, regardless of number of siblings. 
In addition, when comparing first-born to only 
children, he failed to find any systematic differ- 
ences between them. 

Low Anxiety-Easy Evaluation (LE). Both 
stooges portray themselves as low in anxiety (30). 
Such behavior on their part should enable Ss to 
reduce their own anxiety and at the same time 
achieve self-evaluation. 

Low Anxiety-Difficult Evaluation (LD). One 
stooge appears to be moderately anxious (60), the 
other, not anxious at all (0). Anxiety reduction 
ought to be possible here, but self-evaluation 
should be difficult. 
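
The stooge anxiety levels given for the four treatments can be tabulated 
in a short sketch. The levels are those stated above; the summary 
functions (a simple mean and an absolute difference) are our own 
illustrative choices and, as the text itself cautions, a simple average 
is only a rough stand-in for the level of anxiety an S actually perceives. 

```python
# Scripted anxiety levels (0-100 scale) for the two stooges in each treatment.
STOOGE_LEVELS = {
    "HE": (70, 70),   # High Anxiety-Easy Evaluation
    "HD": (100, 40),  # High Anxiety-Difficult Evaluation
    "LE": (30, 30),   # Low Anxiety-Easy Evaluation
    "LD": (60, 0),    # Low Anxiety-Difficult Evaluation
}

def summarize(levels):
    """Return the mean level and the discrepancy between the two stooges."""
    first, second = levels
    return {"mean": (first + second) / 2, "discrepancy": abs(first - second)}

summary = {name: summarize(levels) for name, levels in STOOGE_LEVELS.items()}

for name, stats in summary.items():
    print(name, stats)
```

On these figures the two high-anxiety treatments share a mean of 70 and 
the two low-anxiety treatments a mean of 30, while the easy and difficult 
evaluation treatments differ only in the 0-point versus 60-point 
discrepancy between stooges. 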


Control Conditions 


Together-No Talk (TNT). In each of the ex- 
perimental conditions, the stooges convey the ap- 
propriate level of anxiety primarily through verbal 
means and secondarily (but still importantly) 
through accompanying gestures and other expres- 
sive movements. In this condition, the stooges sat 
quietly and passively, after all three individuals 
(i.e., stooges and S) had been enjoined by E not 
to talk after he left the room. 

Alone (A). In this condition, no stooges were 
ever present and, accordingly, the S spent the 
waiting period alone. 

It will be observed that our experimental de- 
sign is no more than an elaboration of Wrights- 
man’s original set of conditions. The four ex- 
perimental conditions represent refinements of 
Wrightsman's “together-talk” treatment, while our 
two control conditions, in all essential respects, 
parallel his. 

In each of the six conditions (both experimental 
and control), there was a total of 16 Ss, 8 first 
born and 8 later born. 
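
As a check on the arithmetic of the design, the planned cells can be 
enumerated in a brief sketch. The condition labels and cell size are 
taken from the text; the variable names are ours, and the counts 
describe the planned allocation only. 

```python
from itertools import product

experimental = ["HE", "HD", "LE", "LD"]   # four experimental treatments
control = ["TNT", "A"]                    # two control conditions
birth_orders = ["first born", "later born"]
SS_PER_CELL = 8                           # planned Ss per condition x birth-order cell

# Every condition is crossed with birth order.
cells = list(product(experimental + control, birth_orders))

total_ss = len(cells) * SS_PER_CELL
experimental_ss = len(experimental) * len(birth_orders) * SS_PER_CELL

print(len(cells), total_ss, experimental_ss)
```

The totals agree with the ninety-six Ss reported under Subjects, 
sixty-four of whom were allocated to the experimental treatments. 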


Procedure, Measuring Devices, and Checks 
on the Manipulations 


In all conditions, when S came to the experi- 
mental laboratory, she was escorted by E to a seat 
at a semicircular table. Except for the A condi- 
tion, S was situated between the two stooges. 
Typically (unless S arrived quite early), one stooge 
was already seated by the time S came to the 
laboratory; the other (who waited in an adjoining 
room) would characteristically arrive a few mo- 
ments after S. 

When all three were present (except for the A 
condition, of course), E introduced himself and the 
participants to one another. He explained very 
briefly that they would all be taking part in an ex- 
periment concerned with auditory stimulation, but 
before they could begin, it would be necessary to 
fill out some routine forms which, he said, would 
be self-explanatory. The E then distributed a ques- 
tionnaire labeled: “Auditory Stimulation Experi- 
ment Questionnaire—Form II-L.” Immediately 
after doing so, E apologetically commented that 
the “equipment wasn't ready yet” and it would be 
necessary for him to leave the room for a few 
minutes to make some adjustments. Before de- 
parting, E cautioned the group to be sure not to 
talk before everyone had finished “because we need 
to have your own independent answers.” He in- 
dicated that he didn't care whether they talked 
after that. The E then left the room and entered 
an observation room in order to make sure no talk- 
ing occurred during the prohibited interval (none 
ever did) and to monitor the performances of the 
stooges. The E remained absent for a period of 5 
minutes. 

The purpose of the questionnaire was explained 
as follows: 


In previous auditory experiments we have found 
that a very sensitive indicator of one's perform- 
ance is one's mood. One of our secondary pur- 
poses in conducting this study is to try to specify 
more exactly the relationships between the var- 
ious moods of a person and his behavior under 
conditions of auditory stimulation. Accordingly, 
we ask you to fill out the following questionnaire 
which is concerned with your own mood states. 
For each item, please answer in terms of your 
own feelings at this very moment. 


After some further instructions concerned with 
how to use the rating scales in the questionnaire, 
each S was asked to describe herself, on a 101-point 
scale demarcated at 5-point intervals, in terms of 
the following moods: depression (The exact word- 
ing was: “How depressed would you say you feel 
at this moment?”), excitement, cheerfulness, bore- 
dom, good humor, anxiety, pensiveness, sleepiness, 
irritability, and calmness. Obviously, most of those 
moods serve as filler items. Following each mood 
scale, another 101-point scale appeared on which 
S was instructed to indicate her degree of con- 
fidence about the self-rating she had just made. 
The purpose of these confidence scales will be 
made clear in a moment. 

Stooges paced themselves so that they would 
finish “answering” their questionnaires about the 
same time as did S. When all had finished, one of 
the stooges initiated a conversation with the other 
two girls. Because S would inevitably take part in 
these conversations, it was not possible by any 
means to standardize them completely within any 
given treatment. Stooges were given two principal 
rules to guide them, however: (a) they were to 
stick as closely as possible to certain standard 
comments (for a particular condition) that had 
been well practiced in rehearsal sessions and which 
were deemed appropriate to the anxiety level a 
stooge was supposed to convey, and (b) where 
stooges were to appear discrepant from one an- 
other, they were cautioned to resist the pressures 
toward uniformity (Festinger, 1950, 1954) that 
they were necessarily inflicting upon one another. 
The E returned after 5 minutes had elapsed, 
collected the completed questionnaires, and then 
distributed a fresh copy of the questionnaire to 
each girl. The E explained this action by saying, 
in effect, that moods are inherently subject to 
fluctuations over even very short periods of time 
and, in order to have the most accurate assessment 
of their moods, it was probably desirable to have 
the girls rate themselves again. The E concluded: 
"Two measures are better than one." 

The purpose of this readministration was two- 
fold. The mood reratings permitted us to assess 
the effectiveness of the manipulation of anxiety 
while the changes in confidence ratings constituted 
a check on the ease of evaluation manipulation. It 
follows from social comparison theory that a per- 
son should feel more confident about "what he is 
like" to the extent he can achieve self-evaluation. 
Therefore, confidence about one's self-ratings ought 
to increase, or at least not decline, under condi- 
tions of easy evaluation, but it should decline 
where self-evaluation is difficult. 

After collecting these questionnaires, E said in 
a casual and nonchalant way (as if this were all 
a routine matter prior to the "real" experiment) 
that, of course, some mood changes were spon- 
taneous, but that, to some extent, the moods of 
one person might be influenced by the moods of 
others. “In order to control for this,” E said, it 
would be necessary to take a few minutes so that 
each person could indicate how she thought the 
other two felt. For this purpose, two fresh copies 
of the previously used questionnaire were given to 
each person who was asked to describe the others 
in terms of it. 

The actual purpose of this administration was, 
of course, to check S's perception of the stooges in 
order to see whether their performance was suc- 
cessful in conveying the appropriate impression 
about their emotional state. 

The E now informed the girls that there was 
just one more item of business and then they could 
finally start the experiment. He explained that 
some of the groups were going to be called back 
later on in the semester for continued testing and 
since congenial groups were wanted, it would be 
helpful if the girls indicated as best they could 
how they felt about one another. Their answers, E 
continued, would of course be private and held in 
strictest confidence. 

This sociometric questionnaire contained items 
dealing with such matters as how interested S 
would be in participating in future experiments 
with the same girls; how comfortable she felt with 
them; and, most importantly, how much she liked 
each of them and why. These items are relevant 
to the predictions discussed in the introduction 
concerning birth order and liking. 

After S had completed her questionnaire, the 
stooges were dismissed, and the entire experiment 
was explained to S alone. The reasons for the de- 
ceptions were indicated, and any questions S 
asked, answered. Most Ss were more amused than 
indignant about being fooled, and in no case did 
an S appear emotionally upset. The E, while en- 
gaged in this catharsis session, attempted to ascer- 
tain whether S had previously heard anything 
about the experiment or had suspected either that 
the other girls were experimental confederates or 
that there was to be no “auditory stimulation” ex- 
periment. About 10 Ss had to be eliminated for 
one or more of these reasons. Before S left, E 
made her promise not to reveal anything about 
the experiment to her friends. 


RESULTS 


Reallocation of Ss to Treatment Conditions 


Although the experiment was planned 
with the aim of equal Ns within cells, it
became apparent upon analysis that this 
formal objective would, ironically, serve 
only to increase error variance. The reason 
for this is as follows: generally speaking,
the stooges did a good job in portraying the 
appropriate level of anxiety, but occasion- 
ally an S would of course assign anxiety 
levels to the stooges that were quite dis-
crepant with those they intended to convey.
(This discrepancy could be attributed 
either to a poor performance by one or both 
of the stooges or to the idiosyncratic per- 
ception of S; for present purposes, how- 
ever, the source of the discrepancy is irrele- 
vant.) Not only that, but an S's perception 
of the stooges might correspond very 
closely to how the stooges were supposed to 
appear in another condition. This would be 
true, for example, when both stooges in an 
LE condition would be rated on anxiety, 
say, at 65, a rating very close to the ideal 
for the HE condition. Psychologically, it 
would be nonsensical to retain such an S 
in the LE condition; the phenomenal state 
(and, hopefully, the emotional state) of 
such a person is clearly much more similar
to the average HE S than to the LE S. 

On the basis of this reasoning, a reallo- 
cation of Ss to the most appropriate treat- 
ment condition—based on S's ratings of the 
stooges—was performed.  Reallocations 
were made independently by two of the 
authors, and the few disagreements that 
resulted were resolved through discussion 
afterwards. Here it must be noted that 
anxiety ratings were not the sole criterion 
for reallocation; ratings of “calm” were
considered equally important for this purpose
and ratings of “excited” were used as
an ancillary criterion. The reason for these
multiple criteria is that, from postexperimental
questioning of Ss coupled with
close inspection of the data, it seemed quite
clear that there was no uniformity of interpretation
of the term “anxious” by Ss.
There are at least three distinct connotations
this word could have: (a) excited,
(b) eager with anticipation, and (c) apprehensive
and nervous. Since only the last of
these is what we had wished this term to
convey and since there is reason to believe
that anxious was often taken to mean either
excited or eager, it seemed wiser to utilize
multiple criteria for this reallocation rather
than depend upon a single and probably
unreliable criterion.


TABLE 1
DISTRIBUTION OF SUBJECTS ACCORDING TO ORIGINAL AND REALLOCATED CONDITION
(SEPARATELY FOR FIRST-BORN AND LATER-BORN SUBJECTS)

                         Reallocated condition
Original         First borns             Later borns
condition     HE   LE   HD   LD       HE   LE   HD   LD
HE             6    1    1    0        5    1    2    0
LE             2    5    1    0        3    3    0    2
HD             3    0    3    2        3    0    5    0
LD             1    1    3    3        1    2    4    1
Total         12    7    8    5       12    6   11    3

Ss were assigned to a particular treat- 
ment on the basis of the following rules: if 
the mean anxiety rating of the stooges was 
50 or less and the mean calm rating 50 or
more, an S was placed in a low-anxiety 
condition (decisions on ambiguous or in- 
consistent cases were made by considering 
also ratings of the stooges on excited);
otherwise, an S was assigned to a high- 
anxiety condition. If the mean difference 
between the ratings of the two stooges, on 
both anxiety and calm, exceeded 20, an S 
would be assigned to a difficult-evaluation 
condition; if not, she would be assigned to 
an easy-evaluation condition.* By following


* It might be contended that this reallocation
procedure entails two dangers: (a) the differences 
between the conditions (on both the anxiety and 
evaluation dimensions) are much less sharp than 
originally intended (another way of putting this 
would be to say that within-cell variance, while 
it may be reduced through this procedure, is still 
likely to be very much larger than it would have 


these two rules, it was possible to allocate 
every S to one of the four treatment condi- 
tions of the experiment. This reallocation 
of Ss resulted in the cell frequencies shown 
in Table 1. The column sums represent the 
number of cases in each condition following 
reallocation. The numbers comprising the 
major diagonal represent Ss who remained 
in their original condition. 
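
The two allocation rules stated above can be illustrated with a short Python sketch. It is not the procedure as the authors carried it out: the function name is hypothetical, the "mean difference" is read here as the average of the two absolute stooge-to-stooge differences (one on anxiety, one on calm), and the judgment calls on ambiguous cases, which drew on the excited ratings, are not modeled.

```python
def reallocate(anx1, anx2, calm1, calm2):
    """Return 'HE', 'LE', 'HD', or 'LD' from one S's ratings of the two stooges.

    anx1, anx2   -- S's anxiety ratings of stooge 1 and stooge 2 (0-100)
    calm1, calm2 -- S's calm ratings of stooge 1 and stooge 2 (0-100)
    """
    mean_anxiety = (anx1 + anx2) / 2
    mean_calm = (calm1 + calm2) / 2

    # Rule 1: low anxiety if mean anxiety is 50 or less AND mean calm is 50 or more.
    low_anxiety = mean_anxiety <= 50 and mean_calm >= 50

    # Rule 2 (assumed reading): average the two between-stooge differences
    # and call evaluation "difficult" when that average exceeds 20.
    mean_difference = (abs(anx1 - anx2) + abs(calm1 - calm2)) / 2
    difficult = mean_difference > 20

    return ("L" if low_anxiety else "H") + ("D" if difficult else "E")


# Both stooges rated 65 on anxiety, as in the example given earlier in the text.
print(reallocate(65, 65, 30, 30))
```

Under these assumptions an S who rates both stooges at 65 on anxiety is placed in a high-anxiety, easy-evaluation cell regardless of the condition she was originally run in.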


Check on the Ease-of-Evaluation Manipu- 
lation 


In effect, the subject reallocation has 
made it unnecessary for us to ask whether 


been if the original design had proved feasible), 
and (b) it may be misleading to place together 
within the same cell Ss who perceived the stooges 
in a way consistent with the anxiety levels the 
latter were trying to convey (i.e., those Ss whose
cell membership was not affected by the realloca- 
tion procedure) with those who perceived the 
stooges in a way discrepant with the impression 
the latter were attempting to foster (i.e., those on 
whom the manipulation didn’t work). These are 
fair criticisms and together they suggest that only 
those Ss should have been used who “took” the 
manipulation as it was intended; all others should 
have been discarded. How may we answer these 
criticisms? 

Concerning the first-mentioned objection, we
can say only that while it is true that the cells are
indeed likely to be more heterogeneous than we
would have wished, if our hypotheses are borne
out that will be so in spite of, rather than because
of, this fact. Such variability patently operates
against our finding differences between cells, but if
they emerge, then the effects of the variables investigated
are surely even more potent than they
appear. As for the second objection, much the same
argument would seem to apply; in addition, there
is really no firm operational basis for the claim
that the two sets of Ss are different (their ratings
of the stooges at least are equivalent), and, in fact,
there is every reason to infer that their phenomenal
worlds are comparable.


the anxiety manipulation was successful;
we have made it “work” by assigning Ss 
to treatments on the basis of the anxiety 
ratings they gave to the stooges, independ- 
ently of whatever experimental manipula- 
tion they happened to receive originally. 
While this procedure likewise exempts us 
from having to assess the ease of evaluation 
manipulation (if it is defined simply in 
terms of perceived discrepancy between
stooges), it will be recalled that it was 
argued (see p. 7) that confidence concern- 
ing one's self-ratings ought to increase, or 
at least not decline, under conditions of 
easy evaluation, but ought to decrease 
where evaluation is difficult. We may check 
these implications by reference to the data 
in Table 2. 

It will be seen that, while there are essen- 
tially no differences among conditions for 
later borns, there is a suggestion of an in- 
crease in confidence under easy-evaluation 
conditions for first borns, whereas no 
marked change under difficult conditions is 
apparent. This difference, however, falls
short of significance due to the great vari- 
ability of the scores of first-born Ss (a 
matter to which we shall return in a mo- 
ment). Save for this trend, the implication 
from social comparison theory concerning 
the relationship of discrepancy between 
reference persons and changes in confidence 
of self-ratings is not supported. 

Before proceeding, it is necessary to ex- 
plain why, in Table 2, we have used the 
calm dimension instead of that of anxiety. 
The reason is quite straightforward: the 
variability of interpretation of the word
calm did not seem, again on the basis of
postexperimental questioning and inspection
of the data, to be nearly so great as
was the case for anxiety, the ambiguities


of which we have already considered. Ac- 
cordingly, we chose to use the calm dimen- 
sion (actually, the calmness scores were 
transformed to make them parallel to anx- 
iety scores by subtracting them from 100) 
consistently for all major analyses reported 
in this study.


Birth-Order Differences in Confidence Ratings

In the analysis of the confidence data, 
certain ordinal position differences were 
noted which, while not predicted, fit in 
rather neatly with the birth-order theory 
to be considered later. We shall present 
these data here, but defer discussion of 
them for now. 

As can be seen in Table 2, the variability 
of change scores for first borns seems to be 
substantially greater than that of later 
borns. A variance test confirms this impres-
sion (F = 8.73 with 31 df; p < .001). In
fact, later borns hardly change at all, but 
the confidence of first borns often shifts 
dramatically (in both directions). It should
be noted that these fluctuations occur under 
all experimental conditions. 

A comparison of confidence scores on ini- 
tial self-ratings of calm for these same Ss 
disclosed that first borns were significantly 
less confident about their judgments (on 
this dimension) than later borns (70.47 for 
first borns, 81.41 for later borns; t = 2.08, 
p < .05). The question that immediately 
suggests itself is, of course, how generalized 
is this relative lack of confidence on the 
part of first borns concerning their self- 
ratings? While the range of judged char- 
acteristics used in this study was not very 
wide (only the 10 “moods”), it is possible 
to say that this relative lack of confidence 
did not extend even over these items. It did,
interestingly enough, hold for anxiety
(75.78 for first borns, 88.28 for later borns;
t = 2.59, p < .02), but there were no other
significant birth-order differences in confidence
ratings for any of the remaining
moods.


TABLE 2
MEANS AND STANDARD DEVIATIONS OF CHANGE IN CONFIDENCE RATINGS ON THE CALM
DIMENSION (SECOND RATING - FIRST RATING)

Birth order    Statistic   [four condition columns; labels illegible in source]
First borns    Mean          7.86   11.67    3.00   -1.25
               SD           24.13   20.04   18.57   11.58
Later borns    Mean          2.92    0.00    0.91   [one value illegible; column uncertain]
               SD            6.20    5.00    7.69   [one value illegible; column uncertain]


TABLE 3
MEAN CHANGES IN CALM SELF-RATINGS (SECOND RATING - FIRST RATING)

Birth order       LE       LD       HE       HD
First borns     -7.14   -12.00   -10.00   -28.12
Later borns    -15.83    -1.67     1.67     8.18

Birth-Order Differences in Influenceability 


It will be convenient to postpone the
presentation of the data on liking until
after we have considered the findings dealing
with influenceability, findings which
will aid in the interpretation of the liking
data.

There are two very closely related questions
here: (a) Are first borns more influenceable
than later borns (see quote, p. 5)?
and (b) Will first borns make stronger attempts
than later borns to reduce the discrepancies
between themselves and others
by moving toward the perceived emotional
level of others (presumably in the service
of self-evaluative needs)?

The answer to the first question may be
found by reference to the data in Table 3
which presents the changes in calm self-ratings
for first borns and later borns according
to experimental treatment.

An analysis of variance* of these data is
presented in Table 4.

While first borns, in general, become
more afraid (less calm) over time, the effect
is most marked under conditions of high
anxiety and difficult evaluation. These
trends are not apparent for later borns,
who, if anything, seem to become more
calm under conditions of high anxiety and
difficult evaluation. Although the borderline
significance levels of these effects suggest
the need for considerable caution on this
matter, these data point to the following
conclusion: the emotional state (in terms
of calmness) of first borns seems to change
in accordance with their social environment;
changes in the emotional state of
later borns seem to be largely independent
of this environment. In other words, first
borns, as Schachter has asserted, seem to
be more influenceable. Possibly a more conservative
statement (more in keeping with
the present data) would be that first borns
are more susceptible to emotional contagion.


TABLE 4
ANALYSIS OF VARIANCE OF CHANGES IN CALM SELF-RATINGS

Source                   df        SS         MS       F      p
Evaluation (E)            1       4.32       4.32     -
Anxiety (A)               1      57.12      57.12     -
Ordinal position (OP)     1    2008.87    2008.87    3.61   .05-.10 (a)
E X A                     1     357.10     357.10     -
E X OP                    1    1555.68    1555.68    2.79   .10
A X OP                    1    1751.60    1751.60    3.15   .05-.10
E X A X OP                1      25.73      25.73     -
Within                   56   31187.54     556.92

(a) A t test between FBs and LBs proves significant at better than the .05 level. The mean change for
FBs was -14.22, for LBs, 0.31; t = 2.43, .02 < p < .05. Because of the unequal cell frequencies the
ordinal position effect in the analysis of variance is quite discrepant with the t test reported.


* Because of unequal cell frequencies, all analyses
of variance reported in this paper have been
calculated using the method of unweighted means
following the procedure outlined by Winer (1965,
p. 222 ff.).


TABLE 5
MEAN CONVERGENCE SCORES FOR THE CALM DIMENSION

                  LE       LD       HE       HD
First borns    -11.43    10.00    26.67     5.0
Later borns     -1.67   -23.33    -9.17    -5.91


A more precise test of the differential 
influenceability of first borns is available by 
taking into account the extent to which 
any given S moves toward or away from 
the perceived emotional level of the two 
stooges. This “convergence score,” as we 
have called it, is defined as follows: 


\[
\text{Convergence} = \bigl(|SR_1 - S_1| - |SR_2 - S_1|\bigr) + \bigl(|SR_1 - S_2| - |SR_2 - S_2|\bigr)
\]

where

$SR_1$ = S's initial self-rating on calm
$SR_2$ = S's second self-rating on calm
$S_1$ = S's rating of the first stooge on calm
$S_2$ = S's rating of the second stooge on calm


A positive convergence score shows a net 
tendency to move oneself closer to the level 
of the stooges; a negative score shows a net 
tendency to move away from the stooges; 
a convergence score of zero shows, of
course, no tendency to move either toward
or away from the stooges. 
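
As a worked illustration of the definition, the convergence score can be computed as below. This is only a sketch: the function name and the example ratings are hypothetical and are not taken from the data.

```python
def convergence(sr1, sr2, s1, s2):
    """Convergence score on the calm dimension.

    sr1, sr2 -- S's initial and second self-ratings
    s1, s2   -- S's ratings of the first and second stooge
    The score is the reduction in distance to stooge 1 plus the
    reduction in distance to stooge 2.
    """
    toward_first = abs(sr1 - s1) - abs(sr2 - s1)
    toward_second = abs(sr1 - s2) - abs(sr2 - s2)
    return toward_first + toward_second


# Hypothetical S who starts at 80 and ends at 60, with stooges rated 40 and 50.
print(convergence(80, 60, 40, 50))
```

An S who moves from 80 to 60 while rating the stooges at 40 and 50 has closed 20 points of distance to each stooge and so receives a score of 40; the same shift in the opposite direction would yield a negative score.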

In Table 5 are the mean convergence 
scores for each condition; the corresponding
analysis of variance is presented in Table 6. 
The data here on birth-order differences
in convergence are quite clear-cut: first
borns show a significantly greater tendency 
to move toward others compared to later 
borns. This difference is, of course, pre- 


cisely what we would have expected on the 
assumption that first borns have a greater 
need for self-evaluation than later borns. 

The only other significant effect, the 
triple interaction, is in part understandable 
on the basis of the data on change in calm 
self-ratings, already discussed (see p. 10). 
The most important aspect of this inter- 
action concerns the HE cell: first borns 
move sharply toward the other stooges, 
even though this means that they are si- 
multaneously becoming markedly less 
calm; for later borns the opposite is the 
case. This finding would seem to have im- 
plications for the relative strengths of the 
needs for self-evaluation and anxiety re- 
duction; these implications will be pursued 
in the discussion section. 
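
The two significant F ratios in the convergence analysis can be checked by simple arithmetic: each effect has 1 df, so its mean square equals its sum of squares, and F is that mean square divided by the within-cell mean square. The Python sketch below recomputes them from the sums of squares reported in Table 6. The function name is hypothetical, and the sketch does not reproduce the unweighted-means computation itself.

```python
def f_ratio(ss_effect, df_effect, ss_within, df_within):
    """Return F = (SS_effect / df_effect) / (SS_within / df_within)."""
    ms_effect = ss_effect / df_effect
    ms_within = ss_within / df_within
    return ms_effect / ms_within


SS_WITHIN, DF_WITHIN = 33448.59, 56  # within-cell term reported in Table 6

f_ordinal_position = f_ratio(2691.81, 1, SS_WITHIN, DF_WITHIN)
f_triple_interaction = f_ratio(5340.43, 1, SS_WITHIN, DF_WITHIN)

print(round(f_ordinal_position, 2), round(f_triple_interaction, 2))
```

Both values agree with the tabled F ratios of 4.51 and 8.94 after rounding to two decimals.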

It is possible to conceive of even a 
“purer” test of these convergence effects. 
Let us take all Ss who show a change in 
calm self-rating and who, in terms of their 
initial self-rating, perceive both stooges as 
either higher or lower (in calmness) than 
themselves. Then, let us see whether these 
Ss move toward or away from the stooges. 
There are 28 Ss who meet these specifica- 
tions, and their data are presented in Table 
7.

TABLE 6
ANALYSIS OF VARIANCE OF CONVERGENCE SCORES ON THE CALM DIMENSION

Source                   df        SS         MS       F      p
Evaluation (E)            1     858.46     858.46     -
Anxiety (A)               1     697.23     697.23     -
Ordinal position (OP)     1    2691.81    2691.81    4.51   <.05
E X A                     1     522.02     522.02     -
E X OP                    1     116.70     116.70     -
A X OP                    1     234.75     234.75     -
E X A X OP                1    5340.43    5340.43    8.94   <.005
Within                   56   33448.59     597.30


TABLE 7
DIRECTION OF MOVEMENT VIS-A-VIS
STOOGES ACCORDING TO BIRTH ORDER

Birth order    Toward   Away
First born       14       0
Later born        7       7


The birth-order differences are striking:
all 14 first borns move in the direction of
the stooges; only half of the later borns do
so. This difference is significant at the .01
level by Fisher's exact test. 
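
The significance level reported for the Table 7 counts (14 and 0 for first borns, 7 and 7 for later borns) can be checked with a short computation. The Python sketch below implements Fisher's exact test directly from the hypergeometric distribution using only the standard library. The function name and the two-sided convention (summing every table no more probable than the observed one) are assumptions made here, since the text does not say which form of the test was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


p_value = fisher_exact_two_sided(14, 0, 7, 7)
print(p_value < 0.01)
```

Under this convention the probability falls below .01, in line with the level stated in the text.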

Before turning to the liking data, it is 
appropriate to ask, as we did for confidence 
ratings earlier, whether this differential 
tendency toward convergence is evident
over all moods. The answer, as before, is
that this is not a generalized disposition of
first borns. The mean convergence scores 
for each of the 10 moods are presented in 
Table 8. 

While there is obviously no overall dif- 
ference between first borns and later borns, 
it is illuminating to note that birth-order 
differences in convergence are greatest 
(favoring first borns) on the “emotionality 
triad” consisting of calmness, excitement, 
and anxiety. If we compute a difference 
score for each S taking his mean conver- 
gence score on the “emotionality triad” and 
his mean convergence score on all remain- 
ing moods, we find for first borns a signifi- 
cantly greater (z = 2.13, p < .05 by a rank 
sum test) tendency toward convergence on 
the emotionality triad, while there is no 
difference for later borns. The difference
between the difference scores for first borns
and later borns approaches significance
(z = 1.92, p = .06).


TABLE 8
MEAN CONVERGENCE SCORES ON ALL
MOODS FOR FIRST BORNS AND
LATER BORNS

Mood              FB       LB     Difference
Calm             7.81    -1.50      15.31
Excited         14.68     6.56       8.12
Anxious         11.88     7.19       4.69
Cheerful         2.82     0.94       1.86
Depressed        2.18     0.62       1.56
Good humored     1.56     3.75      -2.19
Bored            4.06     6.88      -2.82
Sleepy           0.00     3.12      -3.12
Irritable       -2.50     5.31      -7.81
Pensive          4.38    13.44      -9.06

These results clarify our birth-order dif- 
ferences on influenceability. We conclude 
that first borns are more influenceable than 
later borns and more likely to reduce discrepancies
between themselves and others
by moving toward others, but only on dimensions
of emotionality. Even here, of
course, we have demonstrated this only for
one type of emotion, that comprised by 
calm-anxiety-excitement. Whether these 
differences hold for other emotional states 
(e.g., euphoria) is a question for further 
research.* 


* Obviously, alternative interpretations are possible
here and need to be refuted before our
"emotionality" hypothesis can be held with any
confidence. Possibly the moods constituting the
emotionality triad are simply more important in
interpersonal settings, and their emotional character
is really irrelevant. Also, it might be argued
that since we were in effect trying to manipulate
this emotion, differences here might be expected
to be clearest; perhaps if we had manipulated, say,
depression, birth-order differences would have been
most marked on that dimension. In any case, these
interpretations are easily tested.


Birth Order and Liking for Stooges


In the introduction, we argued, with the
aid of supplementary assumptions, that if
first borns have both a greater need for
self-evaluation and anxiety-reduction, they
should like most the stooges in the LE condition,
like least those in the HD condition,
and like to an intermediate extent
those in the remaining conditions. These
same general trends were also predicted for
later borns, but it was maintained that
they would be much weaker, if indeed they
appeared at all (see p. 4). On the last of
the postexperimental questionnaires, we
asked Ss to answer items concerned with
the following matters:

1. how interested they would be in participating
in future experiments with the
same girls;

2. how comfortable they felt waiting
with these girls;

3. whether, if they were called back for
further experiments, they would prefer to
wait alone, with these girls, or with other
girls;

4. how much they liked each of the girls.

For this analysis, two scores were calcu-
lated: (a) the sum of the liking ratings
received by the two stooges (the liking 
score) and (b) the sum of the ratings of all 
four items (the total attractiveness score). 
The results of this analysis, for both de- 
pendent variables, are shown in Table 9. 

Inspection of the data discloses that the 
trends are, in general, as predicted: for first 
borns, stooges are most favorably evalu- 
ated in the LE condition, most unfavorably 
evaluated in the HD condition, while the 
other two conditions fall in between. For 
later borns, no systematic differences ap-
pear except that stooges in the LD condi-
tion are rated most favorably. This finding
should not be given much weight, however, 
inasmuch as this cell contains the smallest 
number of Ss (three) in the experiment. 

We must hasten to add that neither analysis
of variance shows any significant effects.
These data, then, while “in the right
direction,” cannot by any means be taken
as supportive of the hypothesis.

Of the many possible reasons why this 
hypothesis was not confirmed, most crucial 
for our theoretical purposes are those re- 
lating to the assumptions linking the satis- 
faction of self-evaluation and anxiety-re- 
duction needs to liking. It may prove 
illuminating to test some further implica- 
tions of these assumptions in order to assess 
the factors responsible for this failure of 
prediction. Since we are primarily concerned
with birth-order differences in self-evaluation
and anxiety reduction, the analyses
that follow ignore the experimental
treatment to which Ss were allocated.

One testable implication of the self- 
evaluation assumption is that, other things 
being equal, one should tend to like those 
who are similar to oneself; such persons 
are more likely to be able to satisfy one’s 
need for self-evaluation than persons who 
are dissimilar (Festinger, 1954). It seems 
reasonable to assume that those stooges 
who are perceived to be “close” to S’s own 
level of calmness ought also to be per- 
ceived as more similar than those who are 
more distant. 

One test of this implication would con- 
sist of taking, for each S, the difference 
between his liking rating for the stooge 
closer to his own self-rating and that as- 
signed to the more distant stooge. There 
would then be a single observation, in the 
form of a difference score, for each S. 
When this type of analysis was performed 
using calm self-ratings it was found that 
there was only a slight and insignificant 
tendency for first borns to like relatively
more than later borns those stooges closer 
to their own (second) self-rating. However, 
we happened to make exactly the same 
analysis using second self-ratings on anx- 
iety, and quite divergent results emerged 
which seemed sufficiently intriguing to war- 
rant inclusion here. It should be explicitly 
stated at the outset, however, that these 
findings must be interpreted with caution 
inasmuch as (a) they are based on a dif-
ferent dependent variable from that used
throughout this study and (b) similar re- 
sults were not obtained when using calm 
scores. We are frankly at a loss to account
for this discrepancy which represents the
major inconsistency of the study.


TABLE 9
MEAN LIKING AND TOTAL ATTRACTIVENESS SCORES

[Table body largely illegible in source. Columns: LE, LD, HE, HD. Rows: liking and
attractiveness scores, separately for first borns and later borns. Legible fragments:
10.58 and 10.00 (first borns); 15.00, 11.08, and 11.00 (later borns).]

Note.—Liking scores may range from 2 to 14; attractiveness scores, from 5 to 23.
In each case, a low score indicates a favorable response.


TABLE 10
MEAN LIKING RATINGS FOR “CLOSE” AND “DISTANT” STOOGES BASED ON SECOND ANXIETY
SELF-RATINGS, USING EACH SUBJECT AS HIS OWN CONTROL

Birth order            Close   Distant   Mean difference (a)     p
First borns (24) (b)    2.66     3.42          -.76            <.01 (c)
Later borns (28) (b)    3.11     3.07           .04            n.s.

(a) The difference between these difference scores is significant at approximately the .02 level
(t = 2.39 with 50 df).

(b) The numbers in parentheses represent individuals. All Ss who assigned exactly the same rating
to both stooges (eight first borns, four later borns) had, of course, to be eliminated from this analysis.

(c) t = 2.88 with 23 df.

The results of this analysis are presented 
in Table 10. 

It will be observed that first borns tend 
to like those close to themselves and, rela- 
tively, to dislike those who are more dis- 
tant. No relationship between distance and 
liking is apparent for later borns. It is 
obvious that these findings offer strong 
support to the hypothesis that first borns, 
as compared to later borns, have a greater 
need for self-evaluation. 

Two additional findings relevant to this 
analysis should be mentioned: (a) for 
first borns, sheer distance from one another, 
rather than the direction of the distance, 
seems to be the crucial factor in accounting 
for relative dislike: that is, those who were 
less anxious were liked no better than those 
who were more anxious (it is apparent that 
such a finding is consistent with a self- 
evaluation hypothesis, but contradicts an 
anxiety-reduction interpretation; we shall 
return to this point); (b) these birth-order
differences cannot be attributed to differences
in the average distance between one's
self-rating and those assigned to the
stooges; there were no significant differences
between first borns and later borns in
the average distance between Ss and either
the close stooge or the distant one.


TABLE 11
MEAN LIKING SCORES FOR STOOGES AS A
FUNCTION OF MOVING TOWARD OR
AWAY FROM THEM ON CALM

Birth order     Toward       Away
First born     3.00 (32)   3.00 (19)
Later born     3.23 (22)   2.81 (31)

Note.—Numbers in parentheses represent observations,
not individuals; no significance tests
were applied because of the nonindependence
resulting from the fact that there are two ratings
from each S. The liking ratings of all those Ss
whose calm self-ratings did not change are, of
course, not included here.

For the sake of thoroughness we shall
mention one further implication of the self-evaluation
assumption which we investigated.
It can be argued that a person toward
whom one converges (in the sense of
moving toward the perceived emotional
level of that person) is, other things being
equal, more likely to satisfy one's need for
self-evaluation than a person from whom
one diverges. If this is so, then it follows
from the assumptions stated in the introduction
that the former ought to be better
liked than the latter. Table 11 presents the
data.

There is manifestly no evidence to justify
the supposed linkage between approach
tendencies and liking.

Let us now turn to implications of the
assumption relating anxiety reduction to
liking.

If anxiety reduction leads to liking, it
seems reasonable to predict that stooges
who appear to be calm will, in general, be
better liked than those who do not. There
are two ways this prediction might be
tested. One could compare the mean liking
ratings assigned to the stooges who are
low, intermediate, and high on the calm
variable. One could also take as a reference
point an S's self-rating and compare the
mean liking rating assigned to those who
fall above and below S's own self-rating.
We have done both, and these comparisons
appear separately in Tables 12 and 13.

While the second of these analyses (Ta- 
ble 13) reveals no birth-order differences, 
the first (Table 12) is extremely provoca- 
tive. Not only are the trends in liking di- 
rectly opposite for first borns and later 
borns, but they are exactly the reverse of 
what we would have expected on the as- 
sumption that first borns have a greater 
need for anxiety reduction. These data 
show, on the contrary, that for first borns, 
the less calm a stooge appears the more he 
is liked. It is, surprisingly, later borns who 
seem to like calm-appearing stooges. The 
data in Table 12, it should be noted, also 
fail to support the assumption mentioned 
above as do the data previously reported 
in connection with anxiety and liking (see 
p. 14). 

The gist of all these findings seems to be 
that later borns, if anything, have the 
greater need for anxiety reduction. It is 
patently true, however, that we need better 
data to document this assertion. Happily, 
another implication of the anxiety-reduc-
tion-liking assumption can supply them.

If we like people who reduce our anxiety 
for us, then it's clear we ought to like those 
in whose presence our anxiety decreases 
(and who, presumably, are seen as instru- 
mental in that decrease) and, relatively, 
to dislike those in whose presence our anx- 
lety increases. If, further, it is true that it 
18 later borns, rather than first borns, who 
have a stronger need for anxiety reduction, 
this difference in liking ought to be most 
apparent for them. Since the sum of the 
liking ratings for the two stooges may be 
used here for each S, it is again possible to 
Perform a much needed test of significance. 


TABLE 13
MEAN LIKING RATINGS ASSIGNED TO
STOOGES WHO WERE EITHER HIGHER
OR LOWER ON THE CALM DIMENSION
RELATIVE TO S's SECOND
SELF-RATING

Birth order     Higher       Lower
First born     2.79 (29)   2.92 (25)
Later born     2.95 (21)   3.12 (34)

Note.—N's represent observations, not individuals.
No significance tests were applied for
reasons stated previously. The liking ratings of
Ss whose self-rating coincided with that given to
a stooge are, of course, not included here.


The results of this analysis are shown in 
Table 14. 

The most straightforward test of our 
newly formulated hypothesis involves a 
comparison of “increasers” with “decreas-
ers.” As expected, there is no difference for
first borns, but a rank-sum test reveals that 
the stooges associated with later born 
“decreasers” tend to be significantly less 
liked compared to the stooges in whose 
presence later borns increase in calmness 
(z = 1.92, p = .06). It is clear that if “no
changers” had been classified with “de- 
creasers” (on the assumption that they, too, 
failed to experience any anxiety reduction), 
this difference would have been even more 
marked. Thus, it appears that we shall have 
to reverse ourselves on the question of the 
relationship between birth order and the 
need for anxiety reduction. These analyses 
seem to force the conclusion that the need 
is greater for later borns. 


TABLE 12
MEAN LIKING RATINGS ASSIGNED TO STOOGES WHO WERE PERCEIVED AS LOW (0-30),
INTERMEDIATE (31-70), AND HIGH (71-100) ON CALMNESS

Birth order       Low       Intermediate      High
First borns    2.82 (33)     2.89 (19)     3.08 (12)
Later borns    3.28 (29)     2.90 (20)     2.73 (15)

Note.—N's represent observations, not individuals; no significance tests were applied for reasons
stated previously.


TABLE 14
MEAN LIKING RATINGS FOR THE TWO STOOGES COMBINED AS A FUNCTION OF S's DIRECTION
OF CHANGE IN CALM SELF-RATINGS

Birth order     Increase    No change    Decrease
First born      6.12 (8)    4.33 (6)    6.11 (18)
Later born     5.21 (14)    6.60 (5)    6.77 (13)

Note.—Liking scores may range from 2 to 14. Numbers in parentheses represent individuals.


Control Groups: Changes in Calm Self-Ratings


Additional evidence bearing on the relationship
between birth order and anxiety
reduction is obtainable through a comparison
of the experimental conditions, taken
together, with the control groups. It will be 
recalled that Wrightsman found a signifi- 
cant decrease in anxiety for first borns in 
his together conditions (compared with his 
A condition) and no such relationship for 
later borns. On the basis of these data, he 
argued that first borns, but not later borns, 
choose to affiliate with others in part be- 
cause of the anxiety-reducing functions 
that these others serve. 

In the present experiment we have in ef- 
fect replicated two of Wrightsman's condi- 
tions, TNT and A. We present the changes 
in calm self-ratings for these two conditions 
as well as for all experimental conditions 
combined in Table 15. In no case are there 
any significant within-condition birth-or-
der differences on initial self-ratings of
calmness. 

It is plain that not only have we not
replicated Wrightsman's findings but,
rather, that our data are flatly at variance
with his (although, it should be noted, they
are consistent with what we ourselves had
found earlier). For first borns, in the ex-
perimental conditions (which may be
roughly equated with Wrightsman’s TNT
condition), there is a significant decrease in
calmness (t = 3.00, p < .01); Wrightsman
found the opposite. There is likewise a de-
crease in calmness in the TNT condition
(nonsignificant); Wrightsman again found
the opposite. In the A condition, there is
virtually no change; Wrightsman found a
nonsignificant decline. Table 15 shows that
first borns are most likely to maintain their
initial calmness when they wait alone and
are least likely to do so under together con-
ditions. Wrightsman's data suggest that
calmness will actually increase while wait-
ing with others.

For later borns, we find a slight tendency 
to become more calm in the presence of 
others; so does Wrightsman. But where he 
finds similar effects when later borns wait 
alone, our later borns, on the average, show 
a sharp decline in calmness when they re- 
main by themselves (the small N keeps 
this decline from being statistically signifi- 
cant). 
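
The change scores discussed above are simply the second rating minus the first. The sketch below recomputes them from the mean ratings printed in Table 15. The dictionary layout and the names `ratings` and `change_scores` are illustrative assumptions, not part of the original report, and the two control conditions are labeled TNT and A following the text.

```python
# Mean calm self-ratings (first rating, second rating) as printed in Table 15.
# The dict layout and all names here are illustrative assumptions.
ratings = {
    ("first born", "experimental"): (48.91, 34.69),
    ("first born", "TNT"): (66.88, 63.12),
    ("first born", "A"): (58.75, 58.12),
    ("later born", "experimental"): (54.84, 55.16),
    ("later born", "TNT"): (60.62, 63.75),
    ("later born", "A"): (55.62, 43.12),
}


def change_scores(table):
    """Return second rating minus first rating, rounded to two decimals."""
    return {key: round(second - first, 2) for key, (first, second) in table.items()}


changes = change_scores(ratings)
for (birth_order, condition), change in changes.items():
    print(f"{birth_order:10s} {condition:12s} {change:+.2f}")
```

A negative value indicates a decline in self-rated calmness between the two administrations; a positive value indicates an increase.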

TABLE 15
MEAN CHANGES OF SELF-RATINGS ON CALMNESS FOR EXPERIMENTAL AND CONTROL GROUPS


Group                    First rating    Second rating    Change     t       p
First born
  Experimental (32)      48.91           34.69            -14.22     3.00    <.01
  Control (16)
    TNT (8)              66.88           63.12            -3.76              n.s.
    A (8)                58.75           58.12            -.63               n.s.
Later born
  Experimental (32)      54.84           55.16            .32                n.s.
  Control (16)
    TNT (8)              60.62           63.75            3.13               n.s.
    A (8)                55.62           43.12            -12.50             n.s.


Granting that the sample sizes of the
control groups render firm generalizations
impossible, the data in Table 15 neverthe-
less suggest the following hypothesis: in an
ambiguous situation, being with others per-
mits a later born to retain his initial state
of calmness; if he is by himself, he becomes
afraid. A first born, however, becomes less
calm when he is with others (whether the
others are seen as calm or not), but main-
tains his initial level of calmness when
alone. In other words, affiliation seems to
serve an anxiety-reducing function for later
borns and an anxiety-arousing one for first
borns. It may be, as Schachter (1959) and
many others have clearly demonstrated,
that first borns prefer—much more so than
later borns—to wait with others when
afraid, but we would submit that this oc-
curs despite the fact that others raise the
anxiety level of first borns, not because
they lower it. Why the effect of other per-
sons appears to be so different for first
borns and later borns is, obviously, a very
intriguing question to which we hope to
provide a partial answer in our discussion.

Of course, all this suggests once more 
that the need for anxiety reduction is a 
more important determinant of affiliation 
for later borns than for first borns. There 
is, however, one possible artifact that blurs 
the picture a little. If we look at the termi-
nal self-ratings on calmness for Ss in the
experimental conditions, we note a striking 
difference between first borns and later 
borns (Table 16). 

It is remarkable that not a single first 
born at the conclusion of the 5-minute in- 
teraction describes herself as relatively 
calm (i.e., rates herself higher than 70 on
the 100-point scale). In contrast, 43% of
the later borns depict themselves as calm. 
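
The chi-square value reported with Table 16 can be checked directly from the cell counts. The sketch below is a plain-Python illustration, not the authors' procedure. The function name `chi_square` is hypothetical, and the middle first-born cell is taken to be 16, an assumption chosen because it makes the row sum to the 32 experimental first borns. The p value uses the closed form that holds for the chi-square distribution with 2 degrees of freedom, where the upper-tail probability is exp(-x/2).

```python
import math

# Terminal calm self-ratings (low, medium, high) for experimental Ss, Table 16.
# The first-born middle cell (16) is an assumption; see the lead-in.
observed = [
    [16, 16, 0],   # first born
    [10, 9, 13],   # later born
]


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    return stat


stat = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
# For df == 2 only, the upper-tail probability is exp(-stat / 2).
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.5f}")
```

With these counts the statistic comes to about 16.34 on 2 degrees of freedom, which agrees with the value printed beneath the table and falls below the .0005 level.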

TABLE 16
TERMINAL SELF-RATINGS ON CALMNESS FOR EXPERIMENTAL Ss


Birth order     Low (0-30)    Medium (31-70)    High (71-100)
First born      16            16                0
Later born      10            9                 13


χ² = 16.34, p < .0005


One might argue, therefore, that the rea-
son first borns don’t seem to like calm per-
sons (see p. 15), that they don't particu-
larly like those in whose presence their own
calmness increases (p. 29-80), and that
they seem to have a weaker need for anx-
iety reduction, is simply because in no case
do they ever become very calm themselves.
Perhaps these conclusions would be altered
if some of them did achieve relative calm-
ness. It is really impossible to say, how-
ever, because even an internal analysis is
precluded due to the complete absence of
any first borns in the “high calm” cell with
whom to compare later borns. Even sup-
posing, for the moment, that our conclu-
sions about ordinal position and anxiety
reduction might have to be modified in the
light of new data, the question remains:
Why is it that not one first born, after the
interaction, rated herself as calm, while
nearly half of the later borns did so?


Birth-Order Differences on the Self-Evalu- 
ation Questionnaire 


In the introduction we referred to a 
study (Ring & Braginsky, unpublished 
study) in which we attempted to measure 
the need for self-evaluation in a group of 
Ss comparable to those used in our experi- 
ment. For this endeavor we designed a 
self-evaluation questionnaire (labeled for 
administration purposes a “personality in- 
ventory”). One of the scores that can be 
derived from this instrument pertains to 
the need for self-evaluation concerning 
one’s personality characteristics.* Fifty-
seven female Ss (none of whom took part
in the experiment reported above) for 
whom birth-order data are available took 
this questionnaire; 29 were first born, 28 
later born. 

The mean self-evaluation score of first-
born Ss was significantly higher (t = 2.37,
p < .05) than that of later borns. Thus we
have evidence consistent with that con-
sidered earlier indicating a stronger self-
evaluative need for first borns. It should be
reemphasized, however, that self-evaluation
in the present context refers to the evalua-
tion of personality characteristics, not to
emotional states. Just how generalized the
need for evaluation is cannot, of course, be
answered here, but it is an extremely im-
portant question and one with which we
shall deal in the discussion section.


* The derivation of this score and the logic which
underlies it may be found in the original report, a
copy of which may be obtained by writing the
senior author.

Work on the validation of this question- 
naire is currently in progress, but one 
validation experiment has already been 
executed in connection with the question-
naire study itself. At the time of the test
administration, Ss were told that, if they
wanted to find out their results on the test,
they could come later that week to a psy-
chology department office where that infor-
mation would be available. It was reasoned
that if this questionnaire really did meas- 
ure the need for self-evaluation, the mean 
score of those who appeared for their re- 
sults ought to be higher than that of those 
who failed to come. This expectation was 
confirmed (t = 2.38, p < .05). 


Discussion 
Let us, at this point, recapitulate the
major findings and conclusions of the ex-
periment. We have presented data which
indicate that:


Confidence 


1. Confidence ratings of first borns on
calmness are more variable than those of
later borns.
2. First borns are less confident about
their self-ratings of calmness and anxiety
than later borns.


Influenceability 


1. Being with others in an ambiguous, 
possibly dangerous, situation is anxiety- 
arousing for first borns, not so for later 
borns. 

2. In general, changes in the emotional 
state of first borns seem to depend on the 
nature of the social environment to a much 
greater extent than is true for later borns; 
specifically, first borns show stronger tend-
encies than later borns to converge toward
the emotional level of others. 


Liking


1. First borns tend to like those who are
like (i.e., similar to) themselves.
2. Later borns tend to like those who
induce a decrease in their anxiety level.


On the basis of these, and other data, we
conclude that: (a) first borns (as we as-
sumed in the introduction) have a greater
need for self-evaluation than do later borns
and, with somewhat lesser confidence, (b)
later borns have a greater need for anxiety
reduction than do first borns (initially we
had assumed precisely the opposite).

It will be recalled that in the introduc-
tion we also made two additional assump-
tions concerned with the determinants of
liking (see p. 4). In conjunction with the
assumptions alluded to in the preceding
paragraph, we were then able to predict the
relative liking which stooges in the various
conditions should receive. Inasmuch as it
now appears that later borns, rather than
first borns, have the greater need for anx-
iety reduction, these specific predictions
were bound to fail. The validity of the two
assumptions specifying motivational deter-
minants of liking is another question, how-
ever, and one, it should be pointed out, that
was not of direct concern here. Such find-
ings as first borns tend to like those who
are similar to themselves and later borns
those who reduce their anxiety can obvi-
ously not be cited as data relevant to these
assumptions, since it was in part from these
very results that the needs for self-evalua-
tion and anxiety reduction, respectively,
were inferred. For data pertinent to the
validity of these assumptions, the reader is
referred to the work of Ring (1963, 1964).

Quite clearly, we think, the major finding
of this study (as well as the one which is
best supported by the data) has to do with
the greater influenceability of first-born in-
dividuals. Inasmuch as it is specifically
their emotional state which seems to be
more susceptible to social influence, we
might more accurately conclude that first
borns show greater evidence of emotional
contagion than later borns: that is, they
seem to “catch” and reflect the level of
emotionality in a group more so than later
borns.

This finding is especially noteworthy for
two reasons. In the first place, it is now
clear that the greater influenceability of first
borns is not restricted to matters of opinion
(see discussion, p. 4) but extends to the
realm of emotions as well. Secondly, we
have provided evidence which substantiates
Schachter’s implicit and mandatory as-
sumption that first borns have a greater
need for self-evaluation than do later borns.

Incidentally, we do not mean to suggest
that our particular interpretation of this
contagion effect—in terms of self-evalua-
tive needs—is the only one possible. There
are of course always alternative explana-
tions for any given result. What makes the
self-evaluation interpretation reasonably
compelling, we think, is that it is not only
consistent with the influenceability data, but 
with the data on confidence and liking as 
well. Thus all the principal results of our 
experiment point toward (or at least can be 
"handled" by) the same interpretation. We 
would argue, then, that while another al- 
ternative explanation might as plausibly 
account for our influenceability data, no one 
alternative is likely to account for all of 
our data so well as the self-evaluation in- 
terpretation. 

We shall now present a developmental 
theory which we believe can “make sense” 
of the principal results of this study as well 
as reconcile many disparate findings, re- 
lated to birth order, from other investiga- 
tions. We view this formulation with 
considerable tentativeness, of course, and 
we make no claim that many of the ideas it 
embodies have originated with us. Quite 
the contrary; we acknowledge especially 
the prior work of Schachter (1959) and 
Zimbardo and Formica (1963). 

The starting point of this theory is the
relatively inconsistent treatment the first
born child receives from his parents (there
is evidence that first borns are handled
more inconsistently than later borns; see
Sears, Maccoby, & Levin, 1957). If we
assume, following such pioneers as Cooley
(1902) and Mead (1934), that our self-
conceptions reflect the ways significant
others respond to us, then, plainly, the self-
conceptions of first borns ought to be more
confused (i.e., internally inconsistent) than
those of later borns. Relative to the later
born, the first born should be less certain
what he, as a person, is like or, indeed, less
certain about who he “really” is. In cur-
rently fashionable parlance, the first born
may be regarded as having an “identity
problem.” If this is so, then we would ex-
pect to see him display behavior directed
toward reducing these uncertainties about
himself: that is, he should seek self-evalua-
tion. Much of the first born’s interpersonal
behavior, we would argue, may be inter-
preted as reflecting essentially the Socratic
dictum to “know thyself.”

What sort of parental behavior is it that 
makes a first born uncertain about him- 
self? Zimbardo and Formica (1963), draw- 
ing upon the work of Sears et al. (1957), 
have suggested that in part it is the gap 
between what the parents expect of the 
first born (Sears et al. have data which 
indicate that parents have higher aspira- 
tions for their first borns than for their 
later borns and expect more achievement 
from them) and what the first born is in 
fact capable of that leads him to feel unsure 
of himself and his abilities. Since there is 
no reason to think that first borns are any 
brighter than later borns (Murphy, Mur- 
phy, & Newcomb, 1937), the discrepancy 
between parentally instilled aspirations and 
ability should be greater for first borns 
than for later borns. What are some of the 
behavioral implications of this discrep-
ancy?

Zimbardo and Formica have proposed 
that the greater this discrepancy, the lower 
a person's self-esteem. First borns, then, 
ought to have lower self-esteem than later 
borns; Zimbardo and Formica's data on 
this point tend to support this conclusion, 
although their birth-order differences are 
rather slight. Two consequences of low 
self-esteem seem especially worth noting. 

Janis (Hovland & Janis, 1959) and his 
associates have shown that self-esteem is 
inversely related to influenceability. And it 
has been shown that first borns are, in fact, 
more influenceable than later borns (Ehrlich,
1958; Becker & Carroll, 1962; Staples &
Walters, 1961; as well as the present study). 

People with low self-esteem seem to be 
influenceable in part because they are not 
sure about themselves. One sign of an un- 
certain self-evaluation should be lack of 
confidence concerning one's self-judgments. 
It will be recalled that we found first borns
to be significantly less confident about cer-
tain of their self-ratings than later borns.
It is certainly a testable hypothesis that
lack of self-confidence and need for self- 
evaluation tend to be associated and that 
both are more characteristic of first borns.* 

We have argued that, compared to later 
borns, first borns are expected to do more 
than they can and that falling short of 
their aspirations leads to low self-esteem 
and lack of self-confidence. If it is true, 
however, that first borns do have higher 
aspirations, we ought to be able to find cor- 
roborative evidence in the literature. A re- 
view of relevant studies reveals the follow- 
ing information: first borns have higher 
need for achievement (Sampson, 1962), 
they get better grades in school, and they 
are more likely to go on to college and 
graduate school (Schachter, 1963).* These 
findings are, of course, consistent with the 
assumption of higher aspirations (at least 
in the intellectual realm) for first borns. 
More data are clearly needed to buttress 
this point. 

Let us now follow out some implications
of the hypothesis that first borns have
relatively unstable self-evaluations and,
hence, a greater need for self-evaluation.
While this need can to some extent be
satisfied through impersonal transaction
with the environment, Festinger (1950,
1954) has amply demonstrated that social
interaction which permits interpersonal
comparison is usually required. Other
people, in effect, need to “tell” us what we
are like. If first borns are more in need of
this kind of information, they ought to be
more dependent on others and should be
more likely to seek out others when espe-
cially in need of self-evaluation. This is, of
course, precisely the interpretative line that
Schachter (1959) takes. Haeberle's (1958)
research on dependence in children (cited
by Schachter) supports this interpretation
as do the many studies dealing with the
relationship between birth order and affilia-
tion (Gerard, 1963; Gerard and Rabbie,
1961; Radloff, 1961; Sarnoff and Zimbardo,
1961; Schachter, 1959; Zimbardo and For-
mica, 1963).


* Zimbardo and Formica also conclude that the
self-evaluations of first borns will be less stable,
but their derivation is different and more ingen-
ious: “A further assumption is that later borns
use their older siblings, rather than their parents,
as points of reference in evaluating their rate of
physical development, opinions, abilities, and emo-
tions. The inevitable conclusion is that the ‘dis-
tant’ comparison referents used by first-borns,
being unrealistic, will lead them to unstable self-
evaluation [p. 159].”

Our own experiment furnishes data con- 
gruent with these implications: we showed 
that first borns were more likely to evalu- 
ate themselves in terms of the behavior of 
others: that is, first borns seemed to be 
more socially responsive. This is the type 
of behavior we would anticipate if first 
borns are preoccupied, as we have asserted 
they are, with achieving self-evaluation.
They need to pay attention to the behavior
of others in order to evaluate themselves. 

In this connection, the work of Stotland 
(Stotland & Cottrell, 1962; Stotland & 
Dunn, 1963; Stotland & Walsh, 1963) and 
his colleagues on birth order, self-esteem, 
and empathy becomes relevant. In general, 
they find first borns—surprisingly at first 
blush—to be less empathic than later borns, 
but explain this as being due to the fact 
that “The first and only borns... react as 
if they use the other person’s performance 
level as a guide to self-evaluation, and do 
not really ‘feel with’ him [Stotland & Dunn, 
p. 539]." 

Although Stotland and his associates do 
not explicitly discuss the relationship be-
tween birth order and self-esteem, we have
already seen (see p. 19) that there is some
evidence that first borns, on the average,
are lower than later borns in self-esteem.
Stotland and Dunn (1963) do comment, 
however, on the relationship between self- 
esteem and responsiveness: 


The finding that persons high in self-esteem [pre-
sumably mainly later borns—our insertion] empa-
thize more than those who are low [presumably
mainly first borns] suggest that highs have less
need to be concerned with themselves and can
‘lose themselves’ more in other people. ... Those
high in self-esteem are not as influenced in their
self-evaluation as the lows are.... The lows may
be so concerned with their self-evaluation that
they react to the experiences of others primarily in
terms of its implication to themselves [p. 539].


So much (for now) for self-evaluation.
How may we account for birth-order dif-
ferences in anxiety and anxiety reduction?
Here we are treading on much more specu-
lative ground, and the following comments
should be construed as representing little
more than hunch and guesswork.

There are three principal questions that
need to be faced here: (a) Why do first
borns become more anxious than do later
borns when confronting an anxiety-arous-
ing situation (as Gerard & Rabbie, 1961;
Schachter, 1959; and others have shown)?
(b) Why does being with others in an un-
structured situation raise anxiety for first
borns, but result in essentially no change
for later borns? (c) Why do later borns,
but not first borns, especially like anxiety
reducers? Because our answers will neces-
sarily be speculative, they will be brief.

As to the first question, we would suggest
that because anxiety reduction is more
likely to be mediated by others
(especially parents) for them (Schachter,
1959), first borns become less capable of
coping with anxiety-provoking situations
on their own. Later borns, presuma-
bly, have more often to handle anxiety
situations by themselves and thus are likely
to develop more effective means to deal
with them. We are proposing, obviously,
that the greater dependence of first borns
on others renders them less competent in
these situations.

Turning now to the second question, one
might make the following argument: be-
cause of their (hypothesized) relative in-
expertise in coping with anxiety, first borns
are highly responsive (possibly overrespon-
sive) to situations which are potentially
dangerous. This responsiveness leads them
to be especially sensitive to the cues others
give off concerning their emotional state.
Being with others—despite the fact that
they in the past have mediated anxiety
reduction for them—might serve only to
heighten a first born’s sense of his own in-
effectiveness. Zimbardo and Formica
(1963), in fact, suggest that under the
typical anxiety-arousing conditions of af-
filiation experiments, there is an ability
dimension involved. Specifically: “S asks
the question: ‘Can I take whatever the
experimenter has to give, and can I do this
better than the others’ [p. 160].” Following
our reasoning, we would have to say that
first borns should be expected more often
to answer in the negative.

Before tackling the last question, we
must pause to consider the conflicting find-
ings of the Wrightsman experiment. It will
be recalled (see p. 15) that Wrightsman
found that waiting with others actually
reduced anxiety for first borns. While one
could point to such obvious differences be-
tween the two experiments as initial level
of anxiety (in Wrightsman's experiment it
was lower), subject composition (Wrights-
man used both males and females; sex dif-
ferences are denied, but no data are pre-
sented), the nature of the waiting-period
conversation (uncontrolled in Wrights-
man's experiment), etc., none of these fac-
tors would in themselves seem to account
for the discrepancies. Even in the control
conditions which are most nearly compara-
ble to his, the birth-order trends of the two
studies are still quite divergent. While it is
true that our Ns in these conditions are
smaller than his and that none of our con-
trol-group differences is actually statisti-
cally significant, that is not so for the ex-
perimental conditions (combined) where
the differences between the two experiments
are indeed most marked.

Clearly these discrepant findings are still 
in need of a satisfactory explanation. Since 
it makes a considerable difference, in terms 
of the theory developed in this paper, 
whether the presence of others is anxiety 
arousing or anxiety reducing for first borns, 
we urgently need further investigations 
which study the consequences of affiliation 
for first borns and later borns and relate 
them to the conditions under which affilia- 
tion occurs. 

Finally, why are anxiety reducers espe- 
cially liked by later borns, while first borns 
show no preference between anxiety raisers 
and anxiety reducers? The most parsimoni- 
ous explanation for the data from this ex- 
periment is simply to say that, while first 
borns sometimes declined in anxiety (more 
exactly, increased in calmness), none of 
them ever became very calm (see p. 17): 
that is, their anxiety wasn't reduced enough 
to make a difference. Why a first born
should be so resistant to anxiety reduction
we have tried to indicate in the preceding
pages.

These speculations concerning birth-or- 
der differences in anxiety and anxiety re- 
duction have (possibly as their only merit) 
testable consequences and need to be as- 
sessed before we can incorporate them into 
a theory concerned with birth-order differ- 
ences and social behavior. 


Some Unsettled Issues 


We shall conclude with a few comments 
on three important issues whose implica- 
tions we have heretofore considered only by 
allusion, if at all. These comments may 
be taken as indicating some possible direc- 
tions for further research in this area. 

First, there is the question of the gen- 
erality of the self-evaluative need. We 
have repeatedly found evidence against a 
generality hypothesis even within the lim- 
ited range of characteristics studied in this 
investigation. Yet we have indicated that 
there is some reason to question this inter-
pretation and to consider the matter still
an open question.* Certainly the theory
presented would imply that the need should 
not necessarily be specific only to certain 
evaluable aspects of oneself (e.g., ability 
factors), but rather should be operative 
with respect to many domains of a person’s 
make-up. The evidence from the literature 
on this point, while suggestive, is far too 
fragmentary to warrant any conclusion, 
even a tentative one. 

Another issue concerns a possible interac-
tion between sex and birth-order variables.
Our theory is meant to apply to persons of
either sex, yet all our Ss were females. May
we glibly assume that the sex of a person
contributes only an insignificant proportion
to the variance of behavior stemming from
self-evaluative needs?


* One way to investigate the generality of the
self-evaluative need within the area of emotional-
ity would be to reanalyze the results of the
Schachter and Singer (1962) experiment dealing
with determinants of emotional state according
to the ordinal position of the Ss. Since different
emotional states were manipulated in their study
(viz., euphoria and anger), we would have to pre-
dict that, if self-evaluative needs are generalized
over different emotional states, first borns should
show a greater tendency to emulate the behavior
of the (emotional) stooge than later borns.

Although the data are not at all con-
sistent on this point, there have been re-
ports of some Sex × Birth Order interac-
tions. For example, Gerard and Rabbie
(1961) found the usual relationship among
birth order, anxiety arousal, and affiliation
for females but quite the opposite for males
(i.e., later-born males appeared more anx-
ious and showed stronger affiliative tenden-
cies). The same Sex × Birth Order inter-
action was found for affiliative tendencies
in a subsequent experiment by Gerard
(1963). In an unpublished study by Farina,
Ring, and Weller (1963),* the same tend-
encies were revealed through questionnaire
responses. For example, in reply to the
question: “When you are upset, do you
tend to: (a) go off by yourself or (b) seek
out other people,” we found that while
significantly more (p < .05) of the first-
born females said they would affiliate
(68% to 47%), just the reverse was true
for males (28% for first borns, 46% for
later borns). Comparing affiliative tenden-
cies of first-born males with first-born fe-
males (28% versus 68%), the difference was
significant at beyond the .0005 level by a
chi-square test.

Plausible explanations for such Sex ×
Birth Order interactions have been offered
(e.g., Gerard & Rabbie, 1961), but it is
difficult to evaluate them until the condi-
tions under which these interactions take
place can be specified (for some investi-
gators, including Schachter, find parallel
results for both sexes). Our own suggestion
would be that while tendencies toward self-
evaluation should be the same for both
sexes, there may be certain circumstances
where other competing motivations take
precedence for one sex but not for the
other (e.g., possibly first-born males are
more reluctant to show fear in front of
others and thus seek not to affiliate under
such conditions). The research question is,
of course, what motivations and under
which conditions?

* Unpublished study entitled “Some correlates
of ordinal positions”; available from author.


Finally, we would make a plea: almost
all investigators seem agreed that birth-
order differences reflect certain constants in
the childhood experiences of first borns and 
later borns, and yet there has been a curi- 
ous disinclination, among social psycholo- 
gists at least, to study children in this re- 
gard (so far we must apply this stricture to 
ourselves as well). It seems evident, how- 
ever, that we shall soon reach a theoretical 
impasse without this type of research. The 
most speculative parts of our theory, to 
take one example, are speculative precisely 
because we lack the specific information on 
children that we need. It seems to us, then, 
that further theoretical advances in this 
intriguing area hinge most closely on di-
rect, theoretically oriented investigations of
children, both within and outside of the 
laboratory. If this sounds as if we are say- 
ing that social psychologists need to be- 
come child psychologists as well, that is 
probably because that is exactly what we 
are saying. 


SUMMARY


This investigation was conducted to test
the hypothesis that first-born individuals
have a greater need for both self-evaluation
and anxiety reduction than later borns.
The experimental design was a 2 × 2 × 2
factorial with (a) opportunity for anxiety
reduction and (b) ease of self-evaluation,
the experimentally manipulated variables,
and (c) ordinal position of birth the third
variable. Only female Ss were run. An ex-
perimental session involved one naive S
and two experimental confederates. S was
told that she and the other girls would
shortly be taking part in an experiment
concerned with “auditory stimulation,” but
that it would first be necessary that every-
body fill out a “mood” questionnaire pur-
portedly relevant to the experiment soon to
begin. On a pretext, E soon thereafter left
the room for 5 minutes following which the
two confederates conversed in such a way
as to suggest their emotional state (high or
low in anxiety) and their degree of simi-
larity of emotional state (high or low).
After this 5-minute interlude, E returned,
administered some additional question-
naires, and then informed the naive S of the
true purpose of the experiment. (There was
no “auditory stimulation” experiment.)
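
The factorial layout described above can be enumerated explicitly. The sketch below is illustrative only: the factor labels, the level labels, and the name `design_cells` are hypothetical, and the equal cell size of 8 is an assumption inferred from the 32 first-born and 32 later-born experimental Ss listed in Table 15, not a figure stated in this summary.

```python
from itertools import product

# Two levels per factor; factor and level labels are illustrative assumptions.
factors = {
    "anxiety_reduction_opportunity": ("high", "low"),
    "ease_of_self_evaluation": ("high", "low"),
    "birth_order": ("first born", "later born"),
}


def design_cells(factor_levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factor_levels)
    return [dict(zip(names, combo)) for combo in product(*factor_levels.values())]


cells = design_cells(factors)
ss_per_cell = 8  # assumed equal cell size (64 experimental Ss / 8 cells)

for cell in cells:
    print(cell, ss_per_cell)
print("cells:", len(cells), "total Ss:", len(cells) * ss_per_cell)
```

Under the equal-cell assumption, the four cells for each birth order together hold the 32 experimental Ss per ordinal position shown in Table 15.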

The major findings of the experiment 
may be summarized as follows: 

1. Being with others in an ambiguous, 
possibly dangerous, situation is anxiety 
arousing for first borns, not so for later 
borns.

2. In general, changes in the emotional 
state of first borns seem to depend on the 
nature of the social environment to a much 
greater extent than is true for later borns; 
specifically, first borns show stronger tend- 
encies than later borns to converge toward 
the emotional level of others. 

3. First borns tend to like those who are 
similar (on an emotionality dimension) to 
themselves, whereas later borns tend to like 
those who decrease their anxiety level. 

4. First borns are less confident about 
their self-ratings on emotionality dimen- 
sions; their self-ratings also show greater 
fluctuations than those of later borns. 

The major results of the experiment were 
interpreted as reflecting a greater need for 
self-evaluation on the part of first borns, 
but a greater need for anxiety reduction in 
the case of later borns. A developmental
theory which concerns itself with motiva- 
tion and social behavior in relation to birth 
order and which attempts to integrate find- 
ings from a number of studies was pre- 
sented.

PERCEPTION OF IMPENDING COLLISION:
A STUDY OF VISUALLY DIRECTED AVOIDANT BEHAVIOR¹


WILLIAM SCHIFF 
The City College of the City University of New York 


Theoretical issues and empirical evidence concerned with the perception and 
avoidance of impending collision were discussed. A theoretical framework was 
developed, based on J. J. Gibson’s (1950, 1958, 1959) concepts of ecological 
optics and stimulus information. A series of experiments was performed with 
invertebrate and vertebrate Ss; several stimulus variables were manipulated, 
and several hypotheses derived from the theoretical framework were tested.
It was found that most animals respond avoidantly and directionally to the 
abstract visual stimulus property of accelerated magnification of a dark form 
in the field of view, which specifies the approach of an object and impending 
collision. Such behavior was found to be relatively independent of shape and 
magnification rate (with some exceptions) and is apparently not a product
of associative learning in some species.


In order to survive in a world of natural
enemies and other kinds of potentially 
dangerous approaching objects, man and 
lower animals must hide from, escape 
from, or otherwise avoid these environ- 
mental sources of danger. Although several 
sense modalities may provide information 
leading to appropriate behaviors when 
such events occur, the visual modality is 
one of the more important means of reg- 
istering warning stimuli for many animal 
species. This fact leads to questions of
what visual stimuli actually elicit avoidant
behaviors in various animals, and how such 
stimuli come to elicit avoidant behaviors. 
The first set of problems concerns the na- 
ture of the visual stimuli involved, and 
the second set of problems may initially 
be concerned with whether the avoidant 
behaviors are products of learning. Part 1
of this paper deals with the first set of
problems, and Part 2 deals with the second
set of problems.

———

¹ These investigations were conducted and re-
ported in a different form in partial fulfillment of
the requirements for the PhD degree, at Cornell
University. They were supported by a Public
Health Service Fellowship (MPM-16, 172) from
the National Institute of Mental Health. Addi-
tional support was received from Contract NONR
101 (14) between Cornell University and the Office
of Naval Research and from the Department of
Psychology of Cornell University.
The author gratefully acknowledges the sugges-
tions and encouragement of James J. Gibson in
connection with these studies.


Part 1 


Previous work has produced several hy- 
potheses as to the nature of visual stimuli 
eliciting avoidant behaviors in response to 
approaching objects, but few experimental 
tests of the hypotheses. Gibson (1958) dis- 
cussed what might be called a “family” of 
spatiotemporal transformations of stimuli 
specifying a wide-ranging set of relative 
approach and avoidant relationships be- 
tween animals and environmental objects. 
Some of these have been subjected to pre- 
liminary investigation. Carel (1961) re- 
ported that human Ss (subjects) had some 
success in judging “time-to-collision”—a
concept developed by Purdy (1958). Carel's
Ss viewed an optical representation of a
surface moving toward them. Schiff, Cavi-
ness, and Gibson (1962) found that adult
and infant rhesus monkeys “ducked” or
withdrew abruptly in response to an optical
representation of a rapidly approaching ob-
ject. But there has been no systematic
study of these visually specified approach 
relationships, either across species, or 
within and across stimulus dimensions; 
nor have the stimuli involved been sub-
jected to a more detailed conceptual analy-
sis, although the perceptual correlates of
such stimuli have been generally known 
since Hillebrand (1894). 

The optical stimuli provided by an ap- 
proaching object may be described in the 
abstract as follows. A two-dimensional 
spatial change (expansion or contraction) 
is ordinarily perceived by a human O (ob- 
server) as a spatial change in the third 
dimension. The event may be concep- 
tualized on a projection plane (Gibson, 
1950), but naturally occurs on a sensory 
surface. When a rigid object (A) ap-
proaches a station point (O), the visual
angle (θ) subtended by the object’s con-
tours—or any two points on its surface—
increases as distance (D) decreases and
time (T) passes. Any part of an optical 
pattern being so magnified may impart 
the information of impending collision, 
including time-to-collision. Such a spatio-
temporal event may act as a “higher
order” stimulus (Gibson, 1959) for per-
ception and/or action.

When the approaching object’s velocity 
is approximately constant (or increasing), 
the latter portion of the optical size change 
becomes extremely rapid, since “... the 
relative rate of change of the angular sepa- 
ration between any pair of... points gives 
the reciprocal of the time-to-go [Carel,
1961, p. 21].” The angular size of the object
increases at a geometric rate rather than 
a constant rate, as the object approaches 
an O (Schiff, 1964, pp. 34-35). Such an 
“optical explosion” would occur as the form
begins to fill the entire visual field—about
180° in most vertebrates. This explosively
accelerated portion of a magnification
event may be said to specify imminent
collision and corresponds roughly to what
has been termed “looming” in previous
work (Gibson, 1958; Schiff, 1964; Schiff et
al., 1962).

Fig. 1. A mathematico-optical diagram of the
approach of an object to a station point.
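
The relation between angular size, distance, and time-to-collision described above can be illustrated numerically. What follows is only a minimal sketch, not a computation taken from the original study: it assumes an object of fixed width approaching the station point head-on at constant velocity, and the function names and numerical values are hypothetical. Time-to-collision is estimated from the optical variables alone, as the visual angle divided by its rate of change, which is the small-angle relation quoted from Carel (1961).

```python
import math


def visual_angle(size, distance):
    """Visual angle (radians) subtended by an object of width `size`
    viewed head-on from `distance` (same units as `size`)."""
    return 2.0 * math.atan(size / (2.0 * distance))


def time_to_collision(size, distance, velocity, dt=1e-4):
    """Estimate time-to-collision from optical variables only:
    the angle divided by its rate of change (a small-angle approximation)."""
    theta_now = visual_angle(size, distance)
    theta_next = visual_angle(size, distance - velocity * dt)
    rate = (theta_next - theta_now) / dt  # finite-difference d(theta)/dt
    return theta_now / rate


if __name__ == "__main__":
    size, velocity = 0.25, 2.0  # hypothetical values: ft. and ft./sec.
    for distance in (8.0, 4.0, 2.0, 1.0, 0.5):
        theta = math.degrees(visual_angle(size, distance))
        tau = time_to_collision(size, distance, velocity)
        print(f"D={distance:4.1f} ft.  angle={theta:6.2f} deg  "
              f"optical estimate={tau:5.2f} sec.  "
              f"D/v={distance / velocity:5.2f} sec.")
```

Under these assumptions the angle grows slowly at first and then very rapidly over the last fraction of the approach, while the optical estimate stays close to distance divided by velocity until the angle becomes large.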

One may conceive of a “family” of visual 
stimuli specifying an object’s approach, in- 
cluding family members specifying ap- 
proach at constant velocity, constantly 
changing velocities, and nonconstantly 
changing or irregular velocities. There are 
also family members having the property 
of radially symmetrical magnification, 
which specifies head-on approach and a 
collision course. Optically then, impending 
collision is specified by radially symmetri-
cal magnification of a contour (or con-
tours) in a field of view, with the rate of
magnification being such that velocity will 
not be canceled out by a decreasing rate 
of approach. The visual stimulus specify- 
ing impending collision is thus an invariant 
or nonchanging property which may be 
common to several members of the larger 
family of stimuli specifying approach—in-
cluding noncollision. Approach to an ob-
ject usually involves an additional stimu-
lus component, that of centrifugal flow of
optical texture outside the edges of a form
being optically magnified. The difference
between approach to and of an object be-
comes optically ambiguous in only a few
special cases, as when a form is magnified 
against a uniform untextured background, 
e.g., a clear sky. 

The family of stimuli containing the
above-mentioned invariant also contains
additional properties which may communi-
cate further information about an object.
Such information might conceivably be
used by men and animals to identify the
potential danger or desirability of an ap-
proaching object or animal, and the most
appropriate mode of response to it, e.g.,
escape, duck, dodge, etc. These properties
include: (a) Time-to-collision (Purdy,
1958, p. 68)—specified by rate of change
of angular size with time. (b) Shape of
object—specified by contour. (c) Solidity
of object—specified by continuity of sur-
face texture. (d) Path of approach—speci-
fied by symmetry of magnification. If these
properties are in fact utilized by some ani-
mals in governing responses to visually
perceived approaching objects, one should 
be able to observe members of a family of 
avoidant and protective responses when 
animals are presented with appropriate 
visual displays. The particular mode of 
response elicited by a stimulus specifying 
impending collision (e.g., fleeing, dodging, 
flinching, etc.) would depend on several
factors in addition to visual ones, includ- 
ing species characteristics and abilities, 
stage of individual development, char- 
acteristic ecology, and situational factors, 
e.g., body orientation. These relationships
might be formulated into a complex higher 
order S-O-R concept, in which invariants 
and variants of stimulation correspond to 
invariants and variants of response; the 
response invariant to rapid approach being 
avoidant behavior per se, and response 
variants including the previously discussed 
modes of avoidant or protective behavior. 
Whereas response invariants might be ex- 
pected to hold across species and across 
members of a family of stimuli, response 
variants might be species-specific, or even 
situation-specific. There is a great deal of 
potential information in even the abstract 
stimuli under consideration, for a “single” 
magnification event may convey informa- 
tion concerning space, time, motion, path 
of motion, and object qualities. 

The foregoing theoretical considerations 
were stimulated by Gibson’s recent theo- 
retical views, and lead to several predic- 
tions, general and specific. 

1. Optical magnification specifying a 
dangerous impending collision will elicit 
avoidant and protective behavior signifi-
cantly beyond any elicited by control stim-
uli having similar properties, except those
properties specifying approach or im-
pending collision (e.g., minification, which
specifies recession).

2. Since greater rates of magnification
specify more rapidly approaching objects,
and hence more imminent danger, they
will elicit more abrupt responses, re-
sponses of greater magnitude, or a greater
proportion of responses than less rapid
rates of magnification when less rapid rates
produce less than 100% responses.


3. Since the approach of objects having 
certain shapes (e.g., jagged edges) is ordi- 
narily more dangerous than the approach 
of smooth objects for many animals, mag- 
nification of such dangerous shapes will 
produce more avoidant responses, or re- 
sponses of greater magnitude than will 
magnification of less dangerous shapes. 

4. Since collision between an animal and 
a solid surface is often more biologically 
destructive than collision with a nonsolid
surface, magnification of forms having an 
apparently continuous surface will pro- 
duce more avoidant responses, or responses 
of greater magnitude than will magnifica-
tion of forms having discontinuous sur-
faces.

5. Since path of approach is specified
in the degree of symmetry of a form under- 
going magnification, the direction of avoid- 
ant responses will vary as some function 
of the apparent path of approach, as speci- 
fied in the relative symmetry or rate of 
“skew” of a form undergoing magnifica- 
tion. 


Experiment 1 


Although there has been research con- 
cerning the shape of visually perceived 
forms and avoidant behaviors, little has 
borne directly on the concepts being tested 
here. Tinbergen (1951) reported that shape 
in relation to direction of transverse over- 
head movement produced spontaneous 
flight-fear responses in certain gallinaceous 
birds. Various attempts to confirm this 
finding (e.g., Hirsch, Lindley, & Tolman, 
1955; Melzak, 1961; Rockett, 1955) failed 
to find shape of any importance, although 
a moving shadow per se tended to elicit 
rapidly habituating avoidant responses. 
Clark (1935) reported avoidant responses 
to overhead moving stripes and approach- 
ing square forms in the fiddler crab Uca 
pugnax, attempting an explanation in 
terms of simple taxic mechanisms. How- 
ever, in view of Hebb’s later finding that 
chimpanzees displayed spontaneous fear- 
withdrawal reactions to a variety of mov- 
ing and motionless shapes (Hebb, 1946), 
a taxic response explanation seems inade-
quate when higher species are included.
Smith (1951, 1952) found that shape, two-
or three-dimensionality, and meaningful-
ness of form (e.g., a baseball) were un-
important for humans' discrimination of
"radial motion" at rather slow rates of ap- 
proach and recession. 

The aim of the present study is that of 
investigating avoidant responses in ani- 
mals of several species having rather dif- 
ferent visual systems and response habits 
—manipulating rate of magnification, con- 
tour shape, and surface continuity, while 
presenting stimuli specifying impending 
collision. The various shapes were em- 
ployed to test Predictions 3 and 4, and to 
explore certain concepts developed in previ- 
ous physiological studies of visual systems 
(e.g., Hubel & Wiesel, 1962; Rosenblith,
1961). The variety of shapes and mag-
nification rates used was also a means of
testing the concept of an invariant stimu- 
lus—independent of shape and magnifica- 
tion-rate variations. 


Method 


Subjects. The following species were used as 
subjects: 

1. Fiddler crabs. Fourteen crabs (Uca pugnax)
captured on mud flats and creek beds served as
Ss.² They varied in size from a maximum of 2 in.
along the lateral body axis, to a minimum of about
.5 in. along the same dimension. The smallest were
quite young, but were otherwise of unknown age. 
Five were males and seven were females. They 
were housed individually in plastic bowls contain- 
ing damp mud and water. 

2. Frogs. Twenty-four male Rana pipiens aver- 
aging 2.5 in. in body length were obtained from a 
biological supply house. They were housed indi- 
vidually in glass bowls containing water. 

3. Chicks. Twelve domestic Kimber Chiks were 
hatched in the laboratory from commercially in- 
cubated eggs. They were housed in wire cages in 
groups of six, and received room illumination 15 
hr./day after hatching. Three were males and nine 
were females. When first tested they ranged from 
3 to 10 days old. 

4. Humans. Seven paid volunteers also served
as Ss. Of these three were adult men, three were 
adult women, and one was a 5-yr.-old girl. Two of 
the adults were aware of the nature of the experi- 
ment. 

Apparatus. General description: The apparatus
may be conveniently described in three general
components. (a) Shadow-casting device for pro-
jecting visual stimuli. (b) Projection screen onto
which the stimuli are projected for viewing. (c)
Response table (for subhuman Ss), or a reduction
screen (for human Ss).

² The author wishes to thank Alan Jones for
supplying these Ss.

Fig. 2. Schematic side view of general apparatus.

The shadow-caster rode on a steel track set 
perpendicular to the center of the translucent pro-
jection screen, which was in turn viewed by S from
the opposite side. Figure 2 shows this arrangement. 
The shape of the projected silhouettes was manip- 
ulated by inserting any of several forms in a car- 
rier, which was propelled along the track by an 
electric motor. As the form approached the point- 
source lamp, the shadow underwent a continuously 
accelerated magnification; conversely, as the form 
moved away from the lamp, the shadow underwent 
a continuously decelerated minification. The size
changes corresponded to those produced by an
object moving towards or away from S at a con-
stant velocity. In both cases, these size changes
reached a practical limit—screen size 6 ft., or
about 80° of visual angle, in the case of magnifi-
cation, and about 4 in., or about 4° of visual angle,
in the case of minification. Although it was pos-
sible to perceive the visual events as expansion or
contraction of a form in two dimensions, virtually 
all human Os have reported the event as approach 
and recession of an object in depth—the object re- 
maining a constant size. 
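
The way a form moving at constant speed toward a point source yields an accelerated magnification can be sketched with similar triangles. The example below is illustrative only: the lamp-to-screen distance, the starting distance of the form, and the form width are hypothetical values chosen for the sketch, and only the 3.5-ft. travel and the 1.75-sec. travel time appear in the apparatus description.

```python
def shadow_width(form_width, lamp_to_form, lamp_to_screen):
    """Width of the shadow cast on the screen by a point source
    (similar triangles: shadow / form = screen distance / form distance)."""
    return form_width * lamp_to_screen / lamp_to_form


def shadow_series(form_width, start, travel, travel_time, lamp_to_screen, steps=7):
    """Shadow width at evenly spaced times while the form moves toward
    the lamp at constant velocity. Returns a list of (time, width) pairs."""
    velocity = travel / travel_time
    series = []
    for i in range(steps + 1):
        t = travel_time * i / steps
        lamp_to_form = start - velocity * t
        series.append((t, shadow_width(form_width, lamp_to_form, lamp_to_screen)))
    return series


if __name__ == "__main__":
    # Hypothetical geometry in feet: a 0.25-ft. form starting 3.75 ft. from
    # the lamp, with the screen 5 ft. from the lamp.
    for t, width in shadow_series(0.25, 3.75, 3.5, 1.75, 5.0):
        print(f"t={t:4.2f} sec.  shadow={width * 12:6.1f} in.")
```

Because the shadow width varies as the reciprocal of the lamp-to-form distance, equal steps of travel produce ever larger steps of shadow size, which is the same law that governs the projected size of an object approaching an observer at constant velocity.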
Specific description: The shadow-caster traveled
a distance of 3.5 ft. along the track. The combina-
tion of pulley sizes and gearing used provided
travel times—and hence magnification or minifica-
tion times—of 1.00 sec., 1.75 sec., 2.25 sec., and 4.00
sec. The carrier was pulled along the track by a
nylon-coated stranded-steel wire attached at both
ends of the carrier. Microswitches automatically
shut off the motor when the carrier reached either
end of the track. The electric motor was a Dayton
gear-head type, 1/15 hp. rated at 5000 rpm, with
a 52:1 reduction ratio. The track was swiveled at
the end distal to the point-source so as to provide
skewed approach paths if desired. The shadow-
casting apparatus is shown in a side view in Fig-
ure 3.

³ The author is grateful to James A. Caviness
for his work on the prototype version of this ap-
paratus, and to Robert B. Peters and Neil Аий"
not for their work in the development and con-
struction of this shadow-caster.

Fig. 3. Side view of shadow-caster.

Shadow-casting forms included several two-
dimensional shapes, including a circle, a square, a
jagged eight-point “nonsense” shape, and a four-
sided figure having concave sides. They were of
approximately equal diameter, and approximately
equal area (9 sq. in.). They were constructed of
5 in. plastic sheets painted opaque black. The
discontinuous two-dimensional forms included a
3 in. x 3 in. fence-wire form having .5-in.-square
openings, a clear plastic sheet of the same size
having nine .5-in. opaque spots on it, and an
opaque perimeter .5 in. wide and 4 in. in diameter,
having a clear center. In addition to these two-
dimensional forms, a rubber ball 2.25 in. in diame-
ter with carpenters' nails protruding from all
sides, and a toy Volkswagen sedan 6.5 in. long and
*9 in. in width and height, which provided three-dimen-
sional silhouettes—having apparent depth within
them when in motion, were also used. Finally, in
order to provide a stimulus pattern in which the
usual brightness contrast relationship was
reversed, a form consisting of an aperture 15 in.
in diameter in an opaque shield 27 in. wide and 16
in. high was employed. When this form was moved
toward the point-source lamp from 1.5 ft. away,
magnification of a bright circle against a dark
background was obtained. To assure that no 
unique factor of this more rapid rate of magnifica- 
tion would be responsible for any differential re- 
sults obtained, a circular opaque shadow-caster 
the same size as the aperture was used as a special 
control, with the same limited travel distance. A 
magnification or minification stimulus produced 
by the reverse-contrast form thus had an exact 
mathematical replica, but with reversed figure- 
ground brightness contrast. 

The projection lamp was a 25-watt concentrated 
arc point-source lamp, serviced by a D.C. Power
Supply Converter. This provided polar projection. 

The projection screen was translucent plastic 
measuring 6 ft. square. It was practically grainless
in appearance, and its light transmission properties 
were such that in the experiments reported, the 
measured brightness of the illuminated portion of 
the screen was 0.85 ftL. (footlamberts), and that
of the shadowed portion was 0.035 ftL.; a ratio of 
about 24:1. 

The only source of illumination in the experi- 
mental room was a 7.5-watt lamp mounted atop 
the wooden frame holding the projection screen. 
This provided adequate illumination for observing 
Ss with a minimum of contrast reduction on the
screen. 


The lower half of the screen was covered by a
Masonite pegboard shield on the nonviewing side. 
This provided a textured "ground" without any 
cues of linear perspective from the projected 
shadow of the track. Opaque tape was used to
square up the track shadow. The fact that the 
holes in the board were of equal size and spaced 
at equal intervals eliminated gradients of texture 
size and density. 

For subhuman species, an open-field response 
table was used for observing avoidant behavior. 
This apparatus consisted of a 4-ft-square glass 
table 3 ft. high, and having a 6-in.-high glass re- 
taining wall around the table surface. A circular 
plywood disc 7 in. in diameter was mounted flush
at the center of the table's surface. Animals were 
placed on this disc to view the display at a dis- 
tance of 3.5 ft. from the center of the screen. The 
table was marked with calibrated radii which ra- 
diated from the disc at 30° intervals. Thus loco- 
motor responses could be recorded in terms of di- 
rection and distance. An animal running directly 
away from the screen was moving at 180°. 
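
The scoring of a locomotor response by direction and distance can be sketched as follows. This is an illustration under stated assumptions rather than the scoring procedure actually used: the coordinate convention (x measured toward the screen, y at right angles to it, both in inches from the center of the disc) and the function name are hypothetical. Only the 30° spacing of the radii and the 180° value for a run directly away from the screen come from the description above.

```python
import math


def locomotor_response(x, y, sector=30):
    """Convert a termination point to (distance, direction).

    x is measured toward the screen and y at right angles to it, both in
    inches from the center of the disc (an assumed convention). The
    direction is snapped to the nearest calibrated radius, so a run
    directly away from the screen is scored as 180 degrees."""
    distance = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x)) % 360
    direction = (round(angle / sector) * sector) % 360
    return distance, direction


if __name__ == "__main__":
    # Hypothetical termination points, in inches.
    for point in [(-10, 0), (0, 5), (-6, 4), (10, 1)]:
        distance, direction = locomotor_response(*point)
        print(f"{point}: {distance:5.1f} in. at {direction} deg")
```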

For human Ss, a reduction screen and a peep- 
hole viewer were substituted for the response table. 
The reduction screen measured 50 in. high and 55 
in. wide and was mounted 20 in. off the floor. This
obscured the projection screen frame and the re- 
mainder of the room from S. The display was 
viewed through a peephole 1 in. in diameter, while 
S sat on an adjustable stool. The peephole was 
situated at the same station point used with the 
response table. A cardboard flap on the reduction 
screen was lowered between trials to obscure the 
view of the display. A GSR (galvanic skin re- 
sponse) device was also used (see Appendix). 


Procedure 


Subhuman Ss. Previous pilot studies 
showed no evidence for rapid habituation 
effects with repeated presentation of stim-
uli, and no interaction between successive
changes in shape and magnification rate.
Therefore these conditions were assigned 
randomly, except where otherwise indi- 
cated. One restriction on random ordering 
was that half the Ss of each species re- 
ceived magnification first in a series and 
half received minification first. A further 
exception to the general procedure was 
that frogs received only one magnification 
rate each—half with continuous surfaced 
forms and half with discontinuous sur- 
faced forms. 

Since some animals flinched slightly to 
turning off the overhead lights and since 
sudden darkening is a confounded feature 
of magnification of a shadow silhouette,
animals were placed on the response table
and exposed to 10 trials of turning off the
overhead lights. This procedure effectively
habituated flinching to sudden darken-
ing per se. Previous research (Schiff,
Caviness, & Gibson, 1962) has shown little
or no flinching behavior to the confounded
stimulus property of a sudden size increase
per se, that is, a size increase not specify-
ing approach (expansion, but not magnifi-
cation at a geometrically accelerated rate).

Animals were then presented with five 
magnification trials alternated with five 
minification trials, for each shape and each 
magnification rate. Except for the frogs, all 
animals were exposed to all forms and all 
rates, but not to all combinations of shape 
and rate. Trials were spaced about 15 
sec. apart, and only 20 trials were ad- 
ministered to each animal each day. 

If the animals moved more than 1 ft. 
away from the center of the table, they 
were returned to the center. Otherwise Ss 
were not touched between trials. 

Human Ss. The adults were seated be- 
fore the peephole and fitted with finger 
electrodes (see Appendix). After habituat- 
ing their GSR to the apparatus noise, Ss 
were instructed to look through the peep- 
hole. Each S received a total of six magni- 
fication trials alternated with six minifica- 
tion trials, with each of two magnification 
rates (1.75 sec. and 4.00 sec.). The square, 
the auto, and the “mace” silhouettes were
all used with each S, twice each at each 
rate. Order-of-silhouette presentation was 
completely counterbalanced, and order-of- 
magnification rate was partially counter 
balanced for any given silhouette. Half the 
Ss received magnification first throughout 
the series.

The intertrial interval within a pair of 
magnification-minification trials was about
10 sec., and about 2 min. elapsed between 
each rate or silhouette change. 

The Ss were questioned about the na-
ture of their percepts. After the first pair
of exposures to each silhouette, Ss were
asked to describe what they had seen:
what kind of object, what it was doing, if
anything, motion, if any, and direction of
motion, if any. After each rate change
Ss were asked whether the perceived object
velocity, if any, was the same, less than,
or greater than what they had seen previ-
ously.

The 5-yr.-old S was given an identical
series of trials, except that the GSR was
omitted, and she was closely observed for
blink and head-withdrawal from the
peephole.


Results


Fiddler Crabs. These Ss all responded
differentially to magnification and minifi-
cation stimuli. Avoidant responses oc-
curred to 64% of the magnification trials
and to 0.6% of the minification trials. This
difference was shown to be significant by a
test of significance between correlated pro-
portions (p < .01). The data were then
further analyzed in categories of continu-
ous versus discontinuous surfaced silhou-
ettes. Each S was assigned as a score the
proportion of responses to each type of
form. A Wilcoxon matched-pairs signed-
ranks test (Siegel, 1956) showed this differ-
ence to be significant also (T = 1, p < .01).
Further analysis of the differential shape
data was not attempted, since these figures
did not vary significantly—by inspection.
The same was true for the response mag-
nitude data and for proportion of response
to different magnification rates. Inspection
of the qualitative data, however, indicated
that responses to the more rapid magnifica-
tion rates were almost universally “more
abrupt” than to the slower rates. The re-
versed-contrast magnification condition
produced only 10% avoidant responses—
not significantly in excess of responses to
the minification condition (by in-
spection).
Figure 4 shows the direction and mag-
nitude of locomotor responses to magnifica-
tion and minification stimuli. Inspection
of these data reveals that responses oc-
curred in many directions, but were gen-
erally directed away from the screen and
from the apparent path of approach. Not
all behavior was locomotor. In addition to
running, flinching or flattening out were
frequently observed. Running behavior took
two general forms: a continuous run—most
evident in adult animals—and an inter-
rupted run, that is, a short run followed
by flinching or flattening out in the last
milliseconds before the termination of the
stimulus event. The latter behavior was
observed most frequently in younger ani-
mals. Responses occurred in virtually all
body orientations, with flinching being
most frequent when the animal faced the
screen.

Fig. 4. Locomotor response termination points
of 14 crabs to magnification and minification
stimuli.

Frogs. None of the frogs responded to 
lights-off trials. In contrast to the con- 
sistency of the crabs’ responses, there 
seemed to be definite “responders” and 
“nonresponders” among the 24 animals 
tested. Table 1 reveals that 17 of the 24 
animals accounted for all the responses 
and shows the mean distance jumped to 
magnification of different shapes at differ- 
ent rates. The total distance jumped to all 
magnification stimuli was 886 in. as com- 
pared to 22 in. jumped to minification 
stimuli. The proportion of magnification 
stimuli eliciting responses was .38, while 
the corresponding figure for minification 
was .04. An overall test between correlated 
proportions of responses to magnification 
and minification was significant (p < .01).

TABLE 1
MEAN DISTANCE JUMPED TO MAGNIFICATION AND MINIFICATION OF DIFFERENT
SHAPES AND RATES (FROGS)

Rate    S     Shape
1.0     1     12.6"    6.0"
        2     0.0      0.0
        3     0.0      0.0
        4     7.3      0.0
        5     5.8      0.0
        6     1.2      0.0
1.75    7     14.6     0.0
        8     13.6     1.5
        9     12.1     0.0
        10    0.0      0.0
        11    11.0     0.0
        12    11.2     0.0
2.25    13    0.0      0.0
        14    9.0      0.0
        15    0.0      1
        16    9.5      0.0
        17    8.0      0.0
        18    7.7      0.0
4.0     19    1.5      1.0
        20    0.0      0.0
        21    6.0      1.0
        22    14.7     0.0
        23    0.0      0.0
        24    2.0      0.0


Since the overall test proved significant, 
further analyses of the data seemed ap- 
propriate. A Wilcoxon matched-pairs 
signed-ranks test showed the difference be- 
tween distances jumped to continuous 
versus discontinuous silhouettes to be sig-
nificant (T = 0, p < .001). The five con-
tinuous silhouettes were then further
analyzed, using the Kruskal-Wallis H test 
—a nonparametric analysis of variance 
(Siegel, 1956, pp. 184-194). The result did 
not permit rejection of the null hypothesis 
(H = 44, p > .05). The difference in
magnitude of response to discontinuous 
silhouettes was so slight that analysis was 
not attempted. 
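
For readers unfamiliar with the Wilcoxon matched-pairs signed-ranks statistic reported above, the following minimal sketch shows how T is obtained. It follows the usual textbook definition (zero differences dropped, tied magnitudes given the average rank) and is not a reconstruction of the author's own computation. The function name and the scores in the example are hypothetical.

```python
def wilcoxon_T(first, second):
    """Wilcoxon matched-pairs signed-ranks statistic.

    Returns (T, n), where T is the smaller of the rank sums for the
    positive and the negative differences, and n is the number of
    nonzero differences. Tied magnitudes share the average rank."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 through j+1
        for k in range(i, j + 1):
            ranks[k] = average
        i = j + 1
    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), len(ordered)


if __name__ == "__main__":
    # Hypothetical paired scores for five animals (not data from the study).
    continuous = [8, 7, 6, 9, 5]
    discontinuous = [3, 5, 7, 2, 1]
    print(wilcoxon_T(continuous, discontinuous))
```

A small T means that nearly all of the paired differences lie in one direction; T = 0 arises when every nonzero difference has the same sign.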

Due to the fact that no systematic effects
due to shape had appeared beyond the
continuous-discontinuous dichotomy, it was
decided to pool the data, even though this
involved possible confounding of shape
and rate effects. An H test of the rate
data proved nonsignificant, however, al-
though the figures were suggestive of pos-
sible differential effects (H = 7.2, p >
.05, < .10 when corrected for tied ranks).
The means for the different rates—from
the slowest to the fastest—were 3.8 in.,
5.7 in., 9.8 in., and 6.5 in. The differences
were in the predicted direction, with the
exception of the fastest rate—which was
displaced one position from what was pre-
dicted.
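
The H statistic with a correction for tied ranks can be computed as in the sketch below. It uses the standard Kruskal-Wallis formula and is offered only as an illustration: the function name and the sample values are hypothetical, and no attempt is made to reproduce the values reported above.

```python
def kruskal_wallis_H(groups):
    """Kruskal-Wallis H for a list of samples, corrected for tied ranks.

    H = 12 / (N (N + 1)) * sum(R_j ** 2 / n_j) - 3 (N + 1), divided by
    1 - sum(t ** 3 - t) / (N ** 3 - N), where t is the size of each set
    of tied values. Undefined when every observation is identical."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    rank_of = {}   # average rank assigned to each distinct value
    tie_sum = 0
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1] == pooled[i]:
            j += 1
        t = j - i + 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        tie_sum += t ** 3 - t
        i = j + 1
    rank_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    return h / (1 - tie_sum / (n_total ** 3 - n_total))


if __name__ == "__main__":
    # Hypothetical jump distances (in.) for two groups, with tied values.
    print(round(kruskal_wallis_H([[1, 1, 2], [2, 3, 3]]), 3))
```

The correction matters for data such as jump distances, where many animals share a score of zero; dividing by the tie factor raises H slightly relative to the uncorrected value.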

No jumping occurred in response to mag-
nification or minification of the reversed-
contrast silhouette.

As with the crabs, responses occurred
in virtually all body orientations and in
all directions away from the screen. The
latter data are shown in Figure 5. It was
observed that when Ss were facing the
screen during a magnification trial, they
tended to flinch, or to reorient themselves
away from the screen and then jump, all
in a rapid sequence. Unlike crabs, frogs
cannot locomote backwards, which may be
related to the reorienting behavior.

Fig. 5. Locomotor response termination points
of frogs to magnification and minification stimuli.

Chicks. Like the crabs, some of the chicks
initially responded to the lights-off trials
with flinching responses. The chicks re-
sponded with avoidant behavior to 78%
of the magnification trials and 5% of the
minification trials. This difference was
shown to be significant by a test between cor-
related proportions (p < .01). As with the
crabs, response magnitude and proportion
to the different shapes were so similar that
statistical analysis was not attempted.
With these Ss, however, even the continu-
ous-discontinuous dichotomy did not ap-
pear to have any differential effects. Table
2 shows the proportion and magnitude of
avoidant responses to magnification stim-
uli, pooling the two most rapid and the two
least rapid rates to maximize differential
results and for the sake of economy. Even
this grouping of data failed to produce a
significant effect when magnitude data were
subjected to a sign test (p > .05).
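
The test between correlated proportions reported for each species compares paired outcomes, here a response or no response to magnification and to minification under otherwise matched conditions. One common form of such a test is McNemar's test on the discordant pairs. The sketch below uses the exact binomial version; whether this is the exact procedure applied in the study is an assumption, and the function name and the counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar test on the discordant pairs.

    b: pairs with a response to magnification only
    c: pairs with a response to minification only
    Pairs with the same outcome on both trials do not enter the test."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts: 10 pairs with a response to magnification only,
    # none with a response to minification only.
    print(mcnemar_exact(10, 0))
```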


As with the other species tested, the 
chicks did not respond to the reversed-con- 
trast silhouette in either magnification or 
minification conditions. 

The responses took the forms of flinch- 
ing or locomotion in most cases. However, 
crouching, “back-pedaling,” and hopping 
were also observed. The crouching was an 
immobile squatting, with the animal's 
ventral surface in contact with the glass 
table. This behavior has been previously 
classified as a fear response in this species 
and related species (Daanje, 1950; Hirsch 
et al., 1955; Schaller & Emlen, 1961). 

Responses occurred in virtually all body 
orientations, so long as the screen was in 
the animal’s field of view. Figure 6 shows 
that responses occurred in directions away 
from the apparent motion path, with
but a few exceptions. The majority of responses
were directed away from the
screen (180°) and were of relatively small
magnitude compared to the responses of 
crabs and frogs. 

Humans. The quantitative GSR data 
proved statistically nonsignificant, and 
are presented in the Appendix. 

The qualitative data were more in line
with expectations than were the quantitative
data. With only one exception, the
adult Ss reported all stimulus events as
approach and recession of an object (of
constant size) in depth. The exception
occurred with a nonnaive S, who reported
one percept of two-dimensional contraction
on one minification trial at the 1.75
sec. rate. The 5-yr.-old S accurately reported
approach and recession on every
trial, as did the older Ss. The child was
also observed to blink and/or withdraw
her head abruptly on all magnification
trials, but never during minification trials.


TABLE 2
PROPORTION OF MAGNIFICATION STIMULI PRODUCING
RESPONSES AND MEAN DISTANCE OF
LOCOMOTOR RESPONSES TO MAGNIFICATION
(CHICKS)

1.0 and 1.75 sec. rates     2.25 and 4.0 sec. rates

M distance


Fig. 6. Locomotor response termination points of
chicks to magnification and minification stimuli.

The objects represented by silhouettes 
were named quite accurately in most cases 
—naive Ss being somewhat more variable 
in their interpretations. Two naive Ss re- 
ported the square form as “the back of a 
truck” (they had just been on a long auto 
trip). The child reported the mace as “a 
doll’s head.” All Ss accurately identified 
the auto, and most described the mace as 
either a ball of string or putty, having 
nails in it. 

In every case the Ss accurately reported 
the relative velocity of the stimulus events 
as “faster” or “slower.” When these results 
were subjected to a sign test, the results 
proved significant (p < .02). 
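
As a rough illustration of the sign test cited above, the following Python sketch computes a two-sided binomial probability for a set of paired judgments. The function name and the count of seven Ss are hypothetical; the report does not state how many judgments entered the test, and ties are assumed to have been dropped.

```python
from math import comb


def sign_test_p(n_plus: int, n_minus: int) -> float:
    """Two-sided sign test.

    Returns the probability of a split at least as extreme as the one
    observed when each sign is equally likely (p = .5). Tied pairs are
    assumed to have been discarded before counting.
    """
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # One-tailed probability of k or fewer occurrences of the rarer sign.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example: 7 Ss, all judging relative velocity correctly.
p = sign_test_p(n_plus=7, n_minus=0)
print(f"two-sided p = {p:.4f}")
```

With every judgment in the expected direction, the probability depends only on the number of Ss, so a small group can still yield a small p.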


Discussion 


The various results of this study have
indicated that magnification containing the
invariant property specifying impending
collision produces avoidant behaviors in
fiddler crabs, frogs, and domestic chicks.
Behavioral results of exploratory studies
with domestic kittens (see Appendix) and
human Ss have proved equivocal, although
verbal reports from human Ss tend to support
all hypotheses but that regarding
shape and perceived danger. The control
stimulus—minification—produced only a
few (probably spurious) avoidant responses
in the species tested.

So long as the stimulus included a rela- 
tively continuous shadow pattern under- 
going magnification, responses occurred 
with some regularity in most species 
tested; more regularly with crabs and 
chicks than with frogs. It seems that the 
effective stimulus for these responses was 
not shape-specific, since a wide range of 
shapes did not significantly alter the 
probability or magnitude of these re- 
sponses. 

Discontinuous surfaced forms tended 
to fail in eliciting avoidant behavior in 
some species—especially the crabs and 
frogs. It is interesting to note that both 
these species are amphibious, and it may 
be that this discriminatory characteristic 
is due to differences in the sensitivity of 
visual or response systems which have de- 
veloped in the course of adaptation to 
amphibious environments. There is little
resemblance between the compound eye of 
the crabs and the frog’s visual system; yet 
their behaviors in respect to discontinuous 
surfaced silhouettes were similar. 

Due to the fact that all species tested 
failed to respond to the reversed-contrast 
configuration, further delimitation and 
identification of the sufficient stimulus for 
avoidant responses to approaching objects 
seems called for. The relationship of 
brightness contrast between “closed con- 
tour" and “field of view” is not symmetri- 
cal. In the species tested, certainly, the 
“figure” being magnified must be somewhat 
darker than the background lying outside 
its boundaries. Several explanations of
this phenomenon are possible, but the following
seems likely—if somewhat speculative.
Since approaching objects almost
always reflect less light than their terrain or
sky backgrounds, evolutionary adapta-
tion, physiological mechanisms, or both, 
may have led to rather direct specification 
of figure-ground brightness relationships in 
these lower species. Although the effective 
stimulus is invariant across many other 
transformations (e.g., shape), figure-ground 
contrast is not an effective one in these 
species. If such is the case, it would be 
expected that certain species—which char- 
acteristically avoid luminous approaching 
objects—would be sensitive to even this 
abstract transformation. 

A physiological mechanism which could
possibly account for the above-mentioned
facts would be one encompassing the notion
of rapidly accelerated stimulation of
continuously more peripheral receptor units
of the eye, as a physiological correlate for
the perception of approach, and in the
case of impending collision—symmetrically
continuous toward the periphery. These
stimuli might directly, or indirectly
(through perception), release avoidant or
protective behaviors. Barlow and Hill
(1963) have discovered concentric “on”
and “off” zones in the rabbit’s retina, which
are especially sensitive to centrifugal and
centripetal motion. Certain amendments to
their conceptualizations would have to be
made, since these and similar receptive
fields discovered in other species have
been found to be invariantly responsive to
changes in contrast. Hartline’s finding—
that the frog’s “off” units discharge when
a shadow is moved towards the center of
the receptive field, but not when bright
spots are moved in the same direction (Barlow
& Hill, 1963, p. 412)—seems to indicate
that physiological mechanisms may
be responsible for the failure of animals to
respond to reversed-contrast stimuli. Lettvin,
Maturana, McCulloch, and Pitts’
Group III moving or changing contrast
detectors found in the frog’s visual system
(Rosenblith, 1961) may also be involved;
and Hubel and Wiesel (1962) have
found similar units farther back in the
cat’s visual system.

The failure of the discontinuous silhouettes
to produce avoidant responses in
crabs and frogs may be cited as further
evidence of a similar finding by Lettvin
et al. (1959, p. 1945), in which checked or
dotted patterns moved across the receptive
fields of frogs produced little or no response.

The fact that similar results were ob- 
tained with convex, concave, and straight 
dark boundaries, seems to indicate that 
Lettvin et al.'s Group II units (Rosenblith,
1961) are not specifically involved
in whatever physiological mechanisms 
underlie responses to magnification, since 
Group II units are receptive only to mov- 
ing, dark, convex boundaries, and not to 
concave or straight boundaries. 

Since responses occurred regardless of 
body orientation and eye orientation, the 
theoretical notion that a stimulus invariant
is responsible for the percepts and/or
responses seems further supported. Theo- 
retical conceptualizations such as Gibson’s, 
and the physiological investigators’ cited, 
lean heavily upon the notion of invariants 
—in stimulus configurations and response 
systems. 

The hypothesis that differentially shaped 
and differentially dangerous objects would 
lead to different avoidant responses was 
not supported. It may be that the measures 
used were too gross to show such differ- 
ences, or that no such differences exist. So 
long as a relatively solid object is specified 
along with imminent or impending collision, 
the nature of the object seems to make 
little behavioral difference. 

The hypothesis that more rapid rates of 
magnification should produce avoidant be- 
havior observably different from that pro- 
duced by less rapid rates, received some 
support in the case of frogs, and qualita- 
tive support in the case of fiddler crabs 
and chicks, both of which seemed to re- 
spond more abruptly in the more rapid 
magnification events. The evidence con- 
cerning human responses to different rates 
of magnification was clear-cut in the case 
of verbal reports, but equivocal in the 
case of the GSR measure. 

There were several kinds of related 
avoidant responses in the observed behavior.
All species tested manifested some
form of “flinching,” and all but the kittens 
and humans manifested locomotor be- 
havior as well. Thus the concept of “re-
glance. Such a fact
tamination of “no response” and
tive response of "freezing," which
specifically noted in frogs by (
to the magnification stim-


tional level, related to either a simple 
difference in response strength, or to pro- 
tective habits or “instinctual” responses 
displayed in predator attacks. Small 
crabs flattened into the mud may have a 
better chance of escaping predators in the 
last moments before “collision.” Large 
crabs’ superior running speed and greater 
detectability may make their best chance 
for escape a final dash for their burrows.
While this is speculative evolutionary 
argument at best, it may help to bring 
more order to the rather diverse facts 
discovered in this study. 


Experiment 2 


Since the first experiment demonstrated 
the general effectiveness of magnification 
of a relatively solid form (darker than its 
background) in eliciting avoidant be- 
haviors, it seemed desirable to determine 
further limits of such stimuli so as to
specify as accurately as possible the
stimulus properties producing avoidant behaviors.
Fiddler crabs and chicks were
used as Ss since their behavior had proved
more reliable than that of the other species tested.


Method 

Subjects. Three adult fiddler crabs and two $
wk.old chicks used in the previous experiment
served as Ss.

Apparatus. The same shadow-casting device
used previously was used in this study. Only continuous-surfaced
forms were used.

An RT (reaction time) key attached to an
Esterline Angus event recorder was used to record
response times of Ss in relation to stimulus event
times. This combination was accurate to 25 msec.


Procedure 


Animals were tested individually. They
were exposed to a series of 10 magnification
trials alternated with 10 minification
trials, at each of the four magnification
rates. Silhouettes were selected randomly,
with the stipulation that each shape was
used an equal number of times with each S.

At the start of each magnification event,
E (experimenter) simultaneously pressed
the RT key, making a mark on the moving
record. The key was released when E saw
S begin an avoidant response, providing a
record of the point during the stimulus
event when the response occurred. The E's
RT was later subtracted from these measures.

The null hypothesis was that there
would be no nonchance differences between
the response times during the stimulus
events of different rates.


Results 


Table 3 shows the mean and modal RTs
for the fiddler crabs (Ss 1, 2, 3) and
chicks (Ss 1, 2). The RTs were generally
consistent in both species, with somewhat
more variation occurring at the slower
magnification rates. The modal RTs were
treated as scores in each of the four
conditions and were tested for significance
using a Friedman Two-Way Analysis of
Variance for related samples (Siegel,
1956, pp. 166-173). The resulting χr² was
15 (p < .01, 3 df), allowing rejection of the
null hypothesis. In addition to this


TABLE 3


MEAN AND MODAL RESPONSE TIMES OF THREE FIDDLER CRABS AND
TWO CHICKS TO DIFFERENT MAGNIFICATION RATES

[Table entries are not recoverable from the source.]

Note. Md. stands for modal response time.

response times were in close correspondence
with theoretical collision times.
There appeared a tendency for responses
to occur earlier in the temporal course of
stimulus events, as response times were
[...] proportional to total magnification
[...]


Discussion


The purpose of this experiment was to
determine the approximate threshold or
limit of the effective stimulus eliciting
avoidant behavior. Since the stimulus
termed “looming” specifies imminent collision,
one might expect that it is this
[...] as magnification time increases
(and magnification rate becomes
more gradual), the critical threshold is
reached at an increasing time interval
[...] the recording error inherent in the
apparatus used would account for the lack
of an interval between modal RT and
theoretical collision time with the most
rapid rate, 1.0 sec.

The results may be interpreted as indicating
that these [...] utilize the information of
time-[...] (Purdy, [...]). But an even [...]
This would also account for the fact that
avoidant behavior begun at a time [...] was
proportional to the time remaining before
collision would have occurred.


Experiment 3


[...] the previous experiment, and to evaluate
the tenability of the formulation of a critical
threshold for the appearance of avoidant
responses in the two species tested, a
further study was carried out. The object
of the present study was [...]
magnification prior to the “looming” phase
of a magnification event is necessary for
[...]


Method

Subjects. Three adult fiddler crabs and three
[...] previously, but were
[...] used in Experiment 2.

[...]


Procedure

[...] with each of the two magnification rates,
both to and from the partial magnification
points. Trials were spaced about 15 sec.
apart. The study was carried out over a
7-day period, so that each S received
only 20 trials/day—at one threshold point.
The occurrence and magnitude of avoidant
responses were recorded by E.


Results


Table 4 shows the proportion of avoidant
responses and corresponding mean distance
of locomotion for each S in each
condition. Quantitative data from the two
magnification conditions were almost identical
and hence were combined. Inspection


TABLE 4


PROPORTION OF PARTIAL MAGNIFICATION STIMULI PRODUCING RESPONSES, AND MEAN
DISTANCE OF LOCOMOTOR RESPONSES IN CRABS AND CHICKS

                              15°                     20°                     25°
                      Proportion  M Distance  Proportion  M Distance  Proportion  M Distance
Magnification up to
  Crabs
    1                    0.0        0.0"         0.0        0.0"         0.2        1.0"
    2                    0.1        1.0          0.0        0.0          0.0        0.0
    3                    0.0        0.0          0.1        0.0          0.2        2.0
  Chicks
    1                    0.0        0.0          0.0        0.0          0.0        0.0
    2                    0.0        0.0          0.1        0.0          0.2        1.0
    3                    0.0        0.0          0.2        0.0          0.1        0.0
Magnification from
  Crabs
    1                    0.8        2.0          0.8        3.5          0.8        2.0
    2                    1.0        3.5          0.8        1.5          0.6        1.5
    3                    0.6        6.0          1.0        2.5          1.0        3.5
  Chicks
    1                    0.8        2.5          0.6        2.0          0.8        4.5
    2                    0.8        0.0          0.8        0.0          0.8        1.5
    3                    1.0        2.5          1.0        2.0          0.6        2.0

                              30°                     35°                     40°                     45°
                      Proportion  M Distance  Proportion  M Distance  Proportion  M Distance  Proportion  M Distance
Magnification up to
  Crabs
    1                    0.1        0.0"         0.4        1.5"         0.8        2.0"         0.8        2.5"
    2                    0.0        0.0          0.2        0.0          0.4        2.0          0.8        4.5
    3                    0.4        1.5          0.4        2.0          1.0        2.5          0.8        4.0
  Chicks
    1                    0.2        0.0          0.4        1.5          0.8        2.0          0.8        1.0
    2                    0.4        1.0          0.8        1.5          0.8        1.5          1.0        1.0
    3                    0.4        1.5          0.4        2.0          0.6        2.5          0.8        2.5
Magnification from
  Crabs
    1                    0.4        1.0          0.6        1.0          0.8        1.0          0.8        0.5
    2                    0.6        0.0          0.6        3.0          0.8        0.5          0.6        0.0
    3                    0.8        1.0          0.8        6.5          0.4        4.0          0.6        0.0
  Chicks
    1                    0.8        4.5          0.6        1.5          0.4        1.0          0.6        0.5
    2                    0.8        1.5          0.8        1.0          0.8        0.0          0.4        0.0
    3                    0.6        2.0          0.8        4.0          0.6        1.0          0.6        0.0


TABLE 5
SUMMARY OF ANALYSES OF RESPONSES TO DIFFERING
DEGREES OF PARTIAL MAGNIFICATION
(Crabs and Chicks)

               Partial magnification        Partial magnification
               up to stopping points        from stopping points

Proportion     χr² = 26.0,** 6 df           χr² = 16.8,* 6 df
M dist.        χr² = 31.9,** 6 df           χr² = 16.9,* 6 df

* p ≤ .01.
** p < .001.


of the table reveals that the threshold for
responses to stimuli specifying imminent
collision is not a sharp one—the case with
most so-called thresholds—but that magnification
up to 25° of visual angle produced
few avoidant responses and responses
of relatively small magnitude. With magnification
reaching 30°-35° of visual angle,
definite (although truncated) avoidant
responses appeared. Magnification from
40° or so tended to produce avoidant behavior
frequently. This behavior was
abrupt in nature, involving little locomotion.
A Friedman Two-Way Analysis of
Variance for related measures was used to
test both distance of locomotion measures
and proportion of stimuli producing responses
measures. Table 5 shows a breakdown
of these analyses. Inspection of these
figures reveals that magnification up to a
stopping point led to more consistent
differences in behavior than did magnification
from initially magnified silhouettes—
with both behavioral indexes.
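
The Friedman statistic used in these analyses can be sketched as follows, assuming the usual rank-sum formula, χr² = 12 ΣRj² / [Nk(k + 1)] − 3N(k + 1), where N is the number of Ss, k the number of conditions, and Rj the sum of ranks in condition j. The function name and the scores below are hypothetical and are not the data of Table 3 or Table 4; they are chosen so that every S orders the conditions identically, which gives the largest value the statistic can take for a given layout.

```python
def friedman_statistic(scores):
    """Friedman chi-square (rank-sum form) for related samples.

    scores: list of rows, one per subject, each holding one score per
    condition. Ties within a row receive the average of the tied ranks.
    """
    n = len(scores)
    k = len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        ranks = [0.0] * k
        i = 0
        while i < k:
            # Find the run of tied scores starting at sorted position i.
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            average_rank = (i + j) / 2 + 1
            for m in range(i, j + 1):
                ranks[order[m]] = average_rank
            i = j + 1
        for c in range(k):
            rank_sums[c] += ranks[c]
    sum_sq = sum(r * r for r in rank_sums)
    return 12 * sum_sq / (n * k * (k + 1)) - 3 * n * (k + 1)


# Hypothetical scores: 5 Ss, 4 conditions, identical ordering in every S.
perfect_5_by_4 = [[1.0, 1.5, 2.0, 3.0] for _ in range(5)]
print(friedman_statistic(perfect_5_by_4))

# Hypothetical scores: 6 Ss, 7 conditions, identical ordering in every S.
perfect_6_by_7 = [list(range(7)) for _ in range(6)]
print(friedman_statistic(perfect_6_by_7))
```

Under this formula the maximum for 5 Ss and 4 conditions (the layout of Experiment 2) is 15, the value reported there, which is consistent with every S's modal RTs having had the same rank order across rates. The maximum for 6 Ss and 7 conditions is 36, so the values in Table 5 reflect strong but not perfect agreement among Ss.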

The only qualitative difference between
responses to the final portions of magnification
events and the full length events
was some tendency for responses to the
former to be rather abrupt, as if the
animals were suddenly startled.


Discussion 


The fact that thresholds were relatively
independent of magnification rate over
the range tested seems to indicate that
the threshold for responses to looming is
magnification beyond approximately 30° of
visual angle. Magnification beyond this
point is apparently a sufficient condition
for the appearance of avoidant responses
in these species, and possibly other species
as well. The fact that starting magnification
from 40° of visual angle or more produced
truncated responses suggests that
magnification up to 25° or 30° may function
as an attention-getting or orienting
stimulus requiring further magnification to
produce complete escape or avoidant responses.
Such partial responses support the
classification of these behaviors as “taxic”
—after Lorenz and Tinbergen (1957, pp.
177-178)—falling under the “law of heterogeneous
summation” or Reizsummenphänomen
(Lorenz, 1957, p. 261).

Whether the threshold for avoidant re- 
sponses to looming fluctuates greatly as 
some function of magnification rate is an 
open question awaiting further research. 
Although the present study did not in- 
corporate enough variation in magnifica- 
tion rate to settle this question, it might 
be expected that when magnification be- 
comes sufficiently slow so that it portends 
contact, but not a painful collision, avoid- 
ant responses would give way to other be- 
haviors, such as fending-off, catching, or 
approach responses. 

Von Uexküll (1934, pp. 21-29) developed
the concept of the “farthest plane” of visual
space within an animal’s Umwelt, using
distance as an indicator of such a behavioral
nexus. However, the present study
suggests that visual angle would be a better
indicator. Animals probably tend to flee
sooner from a large object than from a
small object approaching at equal distances
and velocities. Von Uexküll (1934)
recognized this when he stated:


It is hard to decide where the farthest plane be- 
gins in the Umwelt of an animal, for it is difficult 
to determine experimentally at what point an 
approaching object in his environment becomes 
nearer as well as larger in his specific world. [p. 27] 


This issue may be a pseudo-issue for the
behaviorist. But the present finding—that
for these two quite different species 25°-
35° of visual angle subtended by a form
undergoing rapid accelerated magnification
is a sort of threshold for avoidant
behavior—seems a partial answer to the
questions asked years ago by von Uexküll.
It also seems to demonstrate a promising
method for further determining such limits.
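
The argument that visual angle is a better indicator than distance can be made concrete with a small geometric sketch. It assumes a flat object of width w approaching head-on at constant velocity, so that the visual angle at distance d is 2·arctan(w / 2d). The 30° criterion is taken from the threshold estimate above; the function names, object widths, starting distance, and velocity are arbitrary illustrative values, not parameters of the shadow-casting apparatus.

```python
from math import atan, degrees, radians, tan


def visual_angle_deg(width: float, distance: float) -> float:
    """Visual angle (degrees) subtended by a flat object seen head-on."""
    return degrees(2 * atan(width / (2 * distance)))


def time_to_reach_angle(width, start_distance, velocity, angle_deg):
    """Seconds until an approaching object first subtends angle_deg.

    Assumes constant approach velocity; returns 0.0 if the object
    already subtends at least angle_deg at the starting distance.
    """
    threshold_distance = width / (2 * tan(radians(angle_deg) / 2))
    if threshold_distance >= start_distance:
        return 0.0
    return (start_distance - threshold_distance) / velocity


# Two hypothetical objects, same starting distance and velocity.
for width in (0.1, 0.5):
    t = time_to_reach_angle(width, start_distance=5.0, velocity=1.0,
                            angle_deg=30.0)
    print(f"width {width}: reaches 30 deg after {t:.2f} sec")
```

Under these assumptions the larger object reaches the criterion angle sooner and at a greater distance, which is the pattern expected if fleeing is governed by visual angle rather than by distance alone.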


Experiment 4 


The notion of path of approach as an 
informative stimulus dimension has re- 
ceived little experimental treatment and 
only limited theoretical treatment. Koffka 
(1935, pp. 370-372, 645) gave the notion 
some attention, although he never followed 
up his formulations with experimental 
work. Gibson (1958) also discussed the 
problem while attempting to specify the 
stimulus conditions leading to visually di- 
rected avoidant behavior in situations 
where collision is to be avoided. Clark
(1935), while studying the visual acuity
of the fiddler crab Uca pugnax, reported
findings bearing on the present experiment.
He suggested that the crab simply moves
away from the eye being (most) stimulated
by optical motion. The problem with this
interpretation is that it also leads to the
expectation that symmetrical simultaneous
stimulation of both eyes leads to no move-
ment—which proved false in previous
studies reported in this paper. For example, 
Clark (1935) states: 


It seems that a moving object stimulating the 
right eye (irrespective of direction) will cause 
the crab to move to the left, while stimulation 
of the left eye will cause the crab to move to the
right. If this is true, then a pattern plate made up 
of stripes in passing over a fiddler-crab would of 
necessity stimulate both eyes, and tend to cause 
movements in opposite directions. It may well be 
that under such conditions the stimuli inhibit 
each other with the result that no response ap- 
pears. [pp. 312-313]. 


Although this problem is an interesting and 
complex one, a simpler question is asked 
in this experiment: Can these animals 
utilize the information of “skew” in magnification,
which specifies path of approach?
More simply, can these animals
utilize the information in stimulation
specifying “hit or miss” in adaptive guiding
of locomotor behavior?


Method 


Subjects. Ten fiddler crabs used in Experiment
1 were also used in this study. Five were males
and five were females. Several weeks elapsed between
Experiment 1 and the present study.

Apparatus. One silhouette (the square) and one 
magnification rate (4.0 sec.) were used in this 
study. The shadow-caster track was set skewed 
to the side so that the projected silhouette passed 
off the screen just as the carrier stopped in magni- 
fication trials. On minification trials, an edge of 
the silhouette appeared at the edge of the screen, 
then the whole silhouette appeared and “receded” 
toward the stopping point at the center of the 
screen. To a human O this event appears as an 
approaching object “aimed” to the side of O, so 
that it appears to be on a noncollision course, 30° 
from dead center. The minification event appears 
to be an object coming from behind O and to the 
side, aimed for a point dead ahead of O. 


Procedure 


The animals were individually placed on 
the response table, with care being taken 
to assure that all body orientations (one 
for each 30° radius line on the table) were 
represented. After this placement, animals 
were not moved unless they left the 1-ft.
limit set in previous experiments. Although
not all body orientations were used an
equal number of times, all were used
several times, as the crabs reoriented 
themselves frequently. 

Six of the animals were presented with
six magnification trials alternated with six
minification trials, skewed to each side of
the screen, that is, 12 to the 90° side and
12 to the 270° side. The four remaining 
animals received only half as many trials. 
Thus there was a total of 96 magnification 
trials, and as many minification trials— 
48 of each skewed to either side. The side 
of skew, and whether a magnification trial 
or a minification trial was presented first, 
were counterbalanced. 

Notations were made of body orienta- 
tion, direction of response, distance, and 
qualitative aspects of the responses. 

The null hypothesis was that there 
would be no nonchance difference in direc- 
tion of responses with reference to direction 
of path of approach. 


Results 


On the 96 magnification trials, there were
76 locomotor responses—39 when skew
was to the left, and 37 when skew was to
the right. On the 96 minification trials
there were 23 locomotor responses—11
when recession was from the left side and
12 when recession was from the right side.
The probability of such a differential response
frequency’s being due to chance is
quite small (p < .01).

More relevant to this particular study 
was the fact that of the 99 locomotor re- 
sponses, 88% were in quadrants opposite 
to the side of skew and 12% were on the 
same side as the skew. This differential 
also differs significantly from the split ex- 
pected by chance alone (p < .01). Both 
the above differences were evaluated by 
approximations of binomial probabilities. 
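
One common way to approximate such binomial probabilities is the normal approximation with a continuity correction, sketched below. This is an assumption about the kind of approximation meant, since the procedure is not described further. The count of 87 is inferred from 88% of 99 responses, a chance split of .5 is assumed in both comparisons, and the function name is hypothetical.

```python
from math import erfc, sqrt


def binomial_normal_approx(successes: int, n: int, p_chance: float = 0.5):
    """Normal approximation to a two-sided binomial test.

    Returns (z, two-sided p), with a continuity correction of 0.5
    applied to the deviation from the expected count.
    """
    mean = n * p_chance
    sd = sqrt(n * p_chance * (1 - p_chance))
    z = (abs(successes - mean) - 0.5) / sd
    p_two_sided = erfc(z / sqrt(2))
    return z, p_two_sided


# 76 of the 99 locomotor responses occurred on magnification trials.
z_trials, p_trials = binomial_normal_approx(76, 99)
print(f"magnification vs. minification: z = {z_trials:.2f}")

# About 87 of the 99 responses (88%) were directed away from the skew.
z_side, p_side = binomial_normal_approx(87, 99)
print(f"opposite vs. same side:         z = {z_side:.2f}")
```

Both deviates lie well beyond the value of 2.58 required for the .01 level, in agreement with the probabilities reported above.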

Figures 7 and 8 are graphic representa- 
tions of magnitudes and directions of re- 
sponses to skewed magnification and mini- 
fication trials. The points represent the 
terminal loci of responses, where animals 
halted for more than about .5 sec. The data 
were quite symmetrical in that response 
vectors to either side were similar. 

Most of the responses to minification 
occurred in the initial stage of the event, 
when the silhouette edge passed onto the 
side of the screen. 

The highly directional responses shown
in Figures 7 and 8 occurred at varying
body orientations. There appeared a tendency
for animals to move in their natural
direction—sideways in crabs. If an animal
“faced” 0° and skew was to the 270°
side, the response was usually directed to
90°. If, however, the animal faced 30°, the
response was usually directed to 150° when
skew was to the 270° side.

Fig. 7. Locomotor response termination points
of fiddler crabs to skewed magnification and minification
stimuli.

Fig. 8. Locomotor response termination points
of fiddler crabs to skewed magnification and minification
stimuli.

Both younger and older animals re- 
sponded with locomotion as well as with 
other members of the family of avoidant 
behaviors, including flinching and flatten- 
ing out. Younger animals tended to run, 
and then stop and flatten out or freeze in 
the last milliseconds before the silhouette 
passed off the screen. Older animals tended 
to reverse their direction momentarily dur- 
ing this same stage of the skewed magnifi- 
cation event, and then stop. This variety of 
behavior has been called “jumping” (Clark, 
1935). 


Discussion 

The present study supports the notion
that animals may pick up and utilize the
information of path of approach in guiding
locomotion. Even the invertebrate species
used in this study appeared to respond
adaptively to direction of skew.

Since animals often ran to either side, 
even when both eyes were presented stim- 
uli of equivalent intensity, there was no 
direct support for Clark's (1935, pp. 312- 
313) hypothesis stated above. 

There was no greater tendency for 
younger animals to run directly in the path 
of motion, as might have been expected 
from Koffka's (1935, p. 372) formulation. 
A glance at Figures 7 and 8 will reveal 
that if response vectors were drawn, they 
would form approximate right angles with 
the apparent path of motion. This implies 
that even very young animals may dodge 
and thereby avoid rapidly approaching 
objects. Whether or not this behavior is a
product of early learning remains a question
for further research.

The results of this experiment provide 
support for Gibson's notion of the effective 
stimuli utilized by animals in avoiding 
collision during locomotion (Gibson, 1958, 
p. 188). 

From the fact that locomotor responses
were apparently independent of body- and 
eye-orientation, it may be assumed that 
the absolute locus of ommatidial stimula- 
tion plays no important role in guiding 
these behaviors. 


Part 2 


It has been known for some time that an
increase in the size of a “retinal image” is,
roughly, an abstract visual event corresponding
to perceived approach of an object.
Traditional views in psychology have
held that this stimulus alone is not sufficient
to account for the perception of approach,
nor for avoidant responses following
such stimulation or perceptions.
Empiristic psychologists from von Helmholtz
(1925) to transactionalism (e.g., Ittelson,
1951, pp. 197-198; Kilpatrick, 1961,
p. 46) have resorted to some form of past
experience in their explanations of the perception
and avoidance of approaching objects.
These views have early philosophical
origins, the most thorough of which is
Bishop Berkeley's (1957, pp. 39-81).

Other accounts of such avoidant behavior
utilize the assumption that learning
must play a crucial role in the development
of these behaviors, and they combine
the empiristic assumption with some
variation of conditioning theory to cover
the specifics of how men and animals must
learn to perceive and avoid approaching
objects apprehended visually. That such
views are speculative, having little empirical
support, does not prevent their purveyance
as fact—e.g., see Vernon (1962,
p. 28).

Essential to any of the empiristic positions
regarding these phenomena is the
operational condition of contiguous occurrence
of visual stimuli produced by an approaching
object, and a somatic component—
touch or pain. In most cases, the
association is assumed rather than observed.
Such an untested “association hypothesis”
has been criticized in the past,
especially by Koffka (1935, pp. 371-372,
p. 645), McDougall (1960, pp. 34-35), and 
more recently, by a number of ethologists. 
Such rather different views may be boiled 
down to the objection that simply because 
avoidant behavior can be conditioned to 
many stimuli, this is no demonstration 
that this is how the natural process occurs; 
nor is it evidence that conditioning is the 
basis for all meaningful perception and 
avoidant behavior. 


Experiment 5 


The evidence bearing directly on the
“association hypothesis” regarding the perception
and avoidance of impending collision
is sparse and conflicting. Riesen
(1950) reported that dark-reared chimpanzees
did not blink or otherwise manifest
avoidant behavior when objects were
brought up to the eyes under illumination.
However, these animals proved to be functionally
blind in general. Fishman and
Tallarico (1961a, 1961b) found that prematurely
hatched chicks and newly hatched
chicks manifested avoidant and protective
behavior (blinking) when an object was
brought up to the eye. The chicks did not
react similarly to a condition of size increase
without approach, which finding was
later confirmed by Schiff et al.'s (1962)
darkening condition with rhesus
monkeys. Both of these findings provide
evidence that chicks and monkeys were responding
to perceived approach rather than
perceived growth, contrary to recent criticism
(Epstein & Park, 1964, p. 193). It is
true, however, that in the Fishman and
Tallarico studies there were no controls to
assure that the size change corresponded
to the geometrically accelerated size
change accompanying cases of actual approach
at a constant velocity; nor was
there adequate isolation of optical stimulation,
since actual objects were moved
through the air toward the S's eye. Yet,
these studies comprise the only remaining
direct evidence bearing on the specific
event of visually apprehended approach
without opportunity for association with
touch or pain. Other more extensive
studies of depth perception without opportunity
for learning utilized static displays
(Walk & Gibson, 1961), and others
dealing with motion perception were not
concerned with the same issue and came to
different conclusions concerning motion
perception (Meyers, 1964; Riesen & Aarons,
1959).

It therefore seemed appropriate to conduct
a study similar to those mentioned
above, but with better stimulus control,
and with dark-rearing procedures having
a minimum of adverse effects regarding
visual and motor development, while
still eliminating the possibility of association
between visual and contact stimuli.
The hypothesis was that associative learning
is not a necessary condition for avoidant
responses to visual stimuli specifying
imminent or impending collision.


Method 


Subjects. Incubated eggs (Kimber Chiks) were 
obtained in two batches from a commercial hatch- 
ery 1 day prior to hatching. These were hatched 
in the laboratory in heated light-tight cages. 
Twenty of these animals were finally tested, 9 
from one batch of a dozen and 11 from another 
dozen.

A pilot study using dark-reared kittens was 
also attempted (see Appendix). 

Apparatus. The general apparatus used has
been described previously in this paper. All but
the most rapid magnification rate were used.


Special dark-rearing cages were constructed of 
wire, and lined with cardboard inside. Wire screen
bottoms were covered with chickstraw, and the 
cages were covered with solid metal tops. Inside 
measurements were 14 in. long, 9 in. wide, and 9 
in. high. Cages were checked for light leaks, and 
as an additional precaution were wrapped in 
heavy black cloth and kept in a darkroom. Heat 
for incubation was provided by a 100-watt bulb 
in a photo reflector, which was inverted on the 
opaque metal top of the cages. 


Procedure 


Dark-rearing. The heating light was 
kept on until the chicks were 1 day old. 
This maintained the temperature at 98° F. 
Ample food and water were kept available 
in the cages, and maintenance was carried 
out in darkness. Due to these animals’ 
rapid development, there was little oppor- 
tunity for visual and motor retardation to 
occur. 

Testing. The first nine dark-reared 
chicks were tested at different ages to de- 
termine a sufficient developmental age for 
observable responses and to allow for a 
sufficient spread in time of first testing, 
while keeping dark-rearing time to a min- 
imum. The first S was tested when 10 min. 
old, the next three Ss were tested when 4 
hr. old, and the last five Ss were tested 
when 1 day old. They were kept in the 
dark-rearing cages except during testing. 

In each testing session, each animal was 
placed at the center of the response table 
on a paper towel. The room lights were
turned on and off 10 times to habituate 
flinching to “off”—which occurred in some 
animals. The shadow-caster carrier was 
then run up and down the track to check 
for responses to apparatus noise. Then 
magnification and minification trials were 
presented, followed by another noise test. 

Animals received 10 magnification trials 
alternated with 10 minification trials in 
each session. The first five trials were with 
one shape-rate combination, and the sec- 
ond five with another. Trials were spaced 
about 15 sec. apart. Animals were moved 
only if they strayed more than 1 ft. from the 
center of the table. In this case, they were 
grasped from behind, and returned to the 
paper towel. Trials were not begun until the 
S appeared to be looking at the screen. 

The second group of dark-reared chicks
(11 Ss) was kept in the dark until they
were 3 days old, and then tested for the
first time. All 20 Ss were divided so that
half received magnification trials first. The
Ss were banded for identification throughout
the study.

The null hypothesis was that there
would be no nonchance differences in frequency
and magnitude of responses to
magnification versus minification.


Results 


None of the chicks manifested flinching 
or other avoidant behavior in response to 
the noise test. A few animals flinched in
response to the lights-off procedure, but 
these responses habituated in two or three 
trials in all cases. Animals typically re- 
mained in one place during these pretests. 

Of the nine animals in the first group, 
only one failed to respond at all to the 
stimuli specifying impending collision 
(magnification). There were 45 avoidant
responses to the 90 magnification trials 
and no avoidant responses to the minifi- 
cation trials. The mean number of avoid- 
ant responses was 5. 

Similar results were obtained with the
second group of Ss. Two Ss (Ss 19 and
20) received the reversed-contrast stimulus,
and these were the only Ss failing to
manifest any responses to magnification.
Because previous experiments had apparently
shown that almost all subprimates fail to
respond avoidantly to the reversed-contrast
stimuli, these two Ss were dropped
from further statistical treatment. The
nine remaining chicks manifested 63
avoidant responses to the 90 magnification
trials, and only 2 such responses to
minification trials. The mean number of
avoidant responses in this group was 7.

The proportion of avoidant responses for
each S in each group, and the mean magnitude
of locomotor responses per S, are
shown in Table 6. The mean distance run was
computed for each S by dividing the total
distance run by the number of trials.

Since the data from the two groups 
appeared quite similar, they were com- 
bined and tested for significance between 
magnification and minification conditions. 
A Wilcoxon matched-pairs signed-ranks
test showed the difference to be highly significant
(p < .001, N = 17), allowing for
rejection of the null hypothesis.

The observed avoidant responses took 
several forms. The younger animals, and 
some of the older animals, usually re- 
mained stationary and withdrew their 
heads back and down. They also fre- 
quently tensed their bodies, and often fell 
back on their hindquarters. Such flinching 
was directly backward when the animals 
faced the screen directly. But when their 
median body plane was toward a parallel 
with the projection screen, they often 
flinched sideways—away from the screen. 
As Table 6 indicates, responses of older
animals often involved locomotion. Wing-flapping
was also frequently observed in
older animals, as well as hopping and
back-pedaling. A circular “darting” behavior
was also frequently observed. The specific
mode of response varied within and
across Ss.


TABLE 6

Proportion of Magnification Stimuli Producing Responses, and Mean Distance
of Locomotor Responses to Magnification in Two Groups of Dark-Reared Chicks

Columns: S; Group 1 (1st testing, 2nd testing, 3rd testing); Group 2 (1st testing);
Proportion and M Distance for each testing.

To test for successive changes in both
measures with repeated testing, Wilcoxon
tests were calculated for the apparently
greater difference, that of locomotor distance
with repeated testing. The difference
between the first and second testings was
not significant (t = 10, p > .05,
N = 6), but the difference between the
second and third testings was
significant, after halving the p value to
take multiple comparisons into account
(t = 1, p = .05, N = 7). However, since
the figures for Group 2 appear quite simi- 
lar to those of Group 1 at the third testing, 
it is not possible to rule out maturation, 
rather than repeated testing, as the factor 
responsible for the difference found. 
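
The small-sample Wilcoxon probabilities quoted above can be checked by direct enumeration. The following is a minimal sketch, not the author's procedure: it assumes that the reported t is the smaller of the two signed-rank sums, that N counts only nonzero differences, and that there are no tied ranks. The function names are illustrative.

```python
from itertools import product


def signed_rank_lower_tail(t: int, n: int) -> float:
    """Exact P(T <= t) under the null for n untied, nonzero differences.

    Under the null hypothesis each rank 1..n is equally likely to carry a
    plus or a minus sign, so all 2**n sign patterns are equiprobable.
    """
    ranks = range(1, n + 1)
    hits = 0
    for signs in product((0, 1), repeat=n):
        rank_sum = sum(r for r, s in zip(ranks, signs) if s)
        if rank_sum <= t:
            hits += 1
    return hits / 2 ** n


def two_tailed_p(t: int, n: int) -> float:
    """Two-tailed probability for the smaller rank sum t, capped at 1."""
    return min(1.0, 2 * signed_rank_lower_tail(t, n))


if __name__ == "__main__":
    print(two_tailed_p(10, 6))  # first versus second testing
    print(two_tailed_p(1, 7))   # second versus third testing
```

Under these assumptions the first comparison gives a two-tailed probability of 1.0 and the second about .031, which agrees with the reported p > .05 for the first comparison. How the reported p = .05 for the second comparison follows from the adjustment for multiple comparisons is not reconstructed here.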

Avoidant responses were manifested to 
all silhouettes with the exception of the 
reversed-contrast silhouette. 

The results using dark-reared kittens
are reported in the Appendix.


Discussion


The results indicated that the previously
discussed “association hypothesis” regarding
responses to visually apprehended
approaching objects is inadequate. Although
results obtained with dark-reared
kittens were less decisive than those obtained
with dark-reared chicks (see Appendix),
they also tend in the direction
of rejection of the empiristic positions.
Learning in the form of association between
a visual stimulus, or consequent perception,
and contact or pain is apparently
not a necessary condition for avoidant behavior
in relation to a visually apprehended
approaching object, even though it
may be a sufficient condition. Without any
prior associational experience, chicks tend
to respond appropriately to such events,
even when the events are represented by
abstract optical components of “real” approach
events.

In light of these findings, it appears that
empiristic explanations of avoidant behavior
to visual stimuli are quite inadequate
as all-inclusive formulations. The theoretical
views of transactionalists and other
empirists (e.g., Mowrer, 1960, p. 129;
Solley & Murphy, 1960, pp. 108-109) are
at best limited formulations. In chicks, at
any rate, no transactional “assumption,” 
no conditioned fear, was necessary for the 
appearance of appropriate avoidant behav- 
ior to visually specified approach. In the 
language of “nativistic” views, the behav- 
ior might be termed “spontaneously or- 
ganized,” “innately released,” or simply 
“unlearned.” In Gibsonian terminology, 
the stimulus information is “picked up” 
without associative learning. 

The results also support the findings of 
Fishman and Tallarico (1961a, 1961b), 
and to the extent that the findings might 
be used to support views that many ani- 
mals may perceive “depth” innately, the 
findings also confirm the conclusions of
Hess (1956), Lashley and Russell (1934), 
Walk and Gibson (1961), and numerous 
others who have found evidence for “in- 
nate” depth perception. 

The fact that a wide range of silhouette 
shapes and a limited sample of magnifi- 
cation rates produced similar responses 
without benefit of associative learning 
further supports the notion suggested 
earlier—that the invariant property of 
“looming” is what makes these stimuli ef- 
fective in eliciting avoidant behavior. 
However, the fact that the reversed-con- 
trast stimulus again failed to produce 
avoidant behavior further suggests that 
the invariant is not mathematically 
“pure,” in that it is not transposable in 
terms of contrast—at least at this phylo- 
genetic level. That the reversed-contrast 
stimulus fails to elicit responses even be- 
fore animals have had visual experience 
further implies that this is a “built in” fea- 
ture of the animals tested. 


Summary 

A series of experiments with visual stimuli
containing the invariant property of
“looming,” which specifies imminent collision,
and with other related stimulus properties,
has established the following major
findings:

1. Stimuli containing the property of
“looming,” or certain variations of it, tend
to elicit a family of avoidant responses 
in several animal species and in human Ss. 
The inverse of this property (minification, 
as opposed to magnification) does not elicit 
such behavior to a significant degree. 

2. It is the visually “explosive” portion 
of geometrically accelerated magnification 
of a more-or-less continuous silhouette 
(depending on species) which constitutes 
a sufficient condition for such avoidant 
behavior to visually apprehended approaching
objects. The threshold for these
behaviors is magnification beyond
approximately 30° of visual angle.

3. The shape of the silhouette under- 
going such magnification is relatively un- 
important in some species, so long as the 
figure being magnified is somewhat darker 
than its background. This asymmetrical 
contrast relationship in the stimuli holds 
for the lower animal species tested, al- 
though such may not be the case in primates,
which have greater powers of abstraction.

4. Within the limits tested, more rapid 
rates of magnification tend to produce qual- 
itatively more abrupt avoidant responses, 
but response magnitude (locomotion) varies 
little, if at all. 

5. Asymmetrical magnification of a form 
tends to produce locomotion at approxi- 
mate right angles to the apparent path of 
approach in fiddler crabs. This finding,
coupled with the findings that responses
of animals of some species are directional
relative to the apparent path of motion,
implies that animals may pick up the information
about the path of approach specified in
the degree of skew in the magnification
stimulus.

6. Association between visual stimuli 
specifying impending collision and somatic
results of contact or collision (touch or 
pain) is not a necessary condition for the 
appearance of avoidant responses to the 
visual event in chicks. 
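
The “geometrically accelerated” magnification referred to in Finding 2 follows from simple projective geometry. As a minimal sketch, assuming an object of fixed frontal size approaching the eye along the line of sight at constant velocity, the visual angle and its growth can be computed as below. The function names and the numerical values in the usage lines are illustrative and are not taken from the apparatus described in this paper.

```python
import math


def visual_angle_deg(size: float, distance: float) -> float:
    """Visual angle (degrees) subtended by an object of given frontal size."""
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


def distance_for_angle(size: float, angle_deg: float) -> float:
    """Distance at which an object of given size subtends angle_deg."""
    return size / (2.0 * math.tan(math.radians(angle_deg) / 2.0))


def approach_profile(size, start_distance, velocity, dt, steps):
    """(time, visual angle) pairs for a constant-velocity approach."""
    profile = []
    for k in range(steps):
        distance = start_distance - velocity * k * dt
        if distance <= 0:
            break  # contact: the object has reached the eye
        profile.append((k * dt, visual_angle_deg(size, distance)))
    return profile


if __name__ == "__main__":
    # Hypothetical values: a 0.2-unit object starting 4 units away.
    for t, angle in approach_profile(0.2, 4.0, 1.0, 0.5, 8):
        print(f"t = {t:3.1f}  angle = {angle:6.2f} deg")
    print(distance_for_angle(0.2, 30.0))
```

With equal time steps the successive increments in visual angle grow larger as the distance shrinks, which is the accelerating or “explosive” final portion of the event. The second function gives the distance at which a given object would reach a stated visual angle such as 30°.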

There seems to be strong support of the 
author's extension of J. J. Gibson's theo- 
retical views concerning stimulus informa- 
tion and its pickup by the visual systems 
of man and lower animals. There is evi- 
dence that information in light specifying 
an object’s approach and impending col- 
lision, its velocity of approach, its path of 
approach, and possibly its solidity may be 
picked up by animals of several species. 
It is also apparent that some animals 
may pick up and utilize such information 
in guiding behavior, without benefit of as- 
sociative learning. 

There is no support for the hypothesis
that response magnitude or probability is
affected by the rate of approach so far as
the sensitivity of the presently used measures
could detect. However, there is some
evidence that response quality may be affected
by this variable. There is no support
for the hypothesis that the relative danger
of an approaching object, as specified by
its shape, is a determiner of response magnitude
or probability, again so far as the
measures were able to detect.

APPENDIX


EXPERIMENT 1


Five common domestic short-haired kittens
were raised in the laboratory (see Appendix,
Experiment 5) and were tested when 6 wk. old,
using the procedure of Experiment 1.

One of the five initially flinched to the lights-off
series. Table A1 shows the proportion of
responses to magnification and minification
stimuli with the 1.75-sec. magnification rate. The
4.0-sec. rate produced no avoidant responses to
either magnification or minification stimuli.
There were no responses to reversed-contrast
stimuli at either rate. Due to the small number
of Ss, and the fact that only three animals
responded to any of the stimuli used, no attempt
was made to further analyze the data.


TABLE A1
PROPORTION OF MAGNIFICATION AND
MINIFICATION STIMULI PRODUCING
RESPONSES (KITTENS)


S    Magnification    Minification
1        0.40             0.00
2        0.40             0.00
3        0.00             0.00
4        0.00             0.00
5        0.10             0.00


A GSR device (Serco Electronics Psychogalvanometer)
was used with the human Ss, except
for the young child. This was a simple
device calibrated -4 to +4 through a baseline
zero. Any response exceeding the calibration
points was recorded +5. The device utilized
finger electrodes and was concealed behind the
reduction screen, with wires and electrodes
extending under this shield. The E sat at the side
of the shield, viewing S, the instrument face, and
the projection screen with equal ease. The Ss
were told that the device was for recording
changes in electrical skin resistance. The GSR
device was then set at baseline zero, and the
apparatus was run repeatedly until GSR to
apparatus noise and situation ceased. This procedure
was repeated when magnification rate
was changed. The GSRs were recorded by E in
terms of amount and direction of deflection from
zero. A deflection was counted only if it occurred
between the start of a stimulus event and
2 sec. after the termination of the stimulus
event. Slow drifts were not recorded.

Table A2 shows the GSR deflections as combined
algebraically across all silhouette shapes
and magnification rates. Apparently positive
deflections (decrease in skin resistance) tended
to occur with magnification, and either no deflection
or negative deflection with minification.
This difference did not reach statistical significance,
however, when tested with a Wilcoxon
matched-pairs signed-ranks test (t = 1, p >
.05). Since all but one difference (S4) appeared
in the same direction, the small N was probably
responsible for the failure to demonstrate significance.


TABLE A2

ALGEBRAICALLY SUMMED GSRs TO MAGNIFICATION AND MINIFICATION STIMULI AND
NUMBER OF PLUS, MINUS, AND ZERO RESPONSES


             Magnification              Minification
S         Sum    +    -    0        Sum    +    -    0
1         4.5    3    2    1        0.0    2    2    2
2         0.5    3    2    1        0.0    0    0    6
3         9.5    4    1    1       -6.5    0    5    1
4        -2.0    3    3    0       -0.5    1    2    3
5        14.0    4    1    1       -5.5    0    0    6
6        18.0    6    0    0       -5.5    0    5    1

Totals          23    9    4               3   14   19


Table A3 shows the breakdown of responses
to magnification for the various shapes and
rates. Since the overall test failed to demonstrate
significance, these data were not analyzed
further. Although the more rapid magnification
rates appeared to produce responses of greater
magnitude with only one exception, thus lending
support to the hypothesis concerning magnification
rate, the shape data did not support
the shape-danger hypothesis, since the square
produced responses of greater magnitude than
did the “mace.”


TABLE A3
ALGEBRAICALLY SUMMED GSRs TO MAGNIFICATION
OF DIFFERENT SHAPES, AND DIFFERENT RATES


S    VWauto    Mace    Square    1.75 sec.    4.0 sec.
1      2.0      0.5      2.0        4.5          0.0
2     -0.5      0.0      1.0        0.5          0.0
3     -4.0      5.5      8.0        6.0          3.5
4      2.5     -4.5      0.0       -0.5         -1.5
5     10.0      1.0      3.0        9.0          5.0
6      9.0      6.0      3.0        7.5         10.5


EXPERIMENT 5 


Five common domestic short-haired kittens 
were obtained (with their mother) when they 
were 18 hr. old. When 5 days old they were 
moved from normal room illumination (their 
eyes still closed) to a light-tight ventilated 
room. Maintenance was accomplished in total 
darkness, or with a shielded pen-flashlight when 
necessary.

There is some evidence that dark-rearing
kittens for several weeks may result in retarded
depth perception (Walk & Gibson, 1961) or
retarded motion discrimination (Riesen & 
Aarons, 1959). To prevent such anomalies in 
these slow-developing animals, while still pre- 
venting the possibility of any association be- 
tween visually specified approaching objects 
and contact or pain, the following procedure 
was initiated. 

When the kittens’ eyes were all open (about
9 days of age), they were randomly divided into
two procedural groups. Experimental group animals
were suspended in pouches with only their
heads protruding and placed in room illumination
for 1 hr. each day. This arrangement permitted
full vision including motion parallax, and
partial although restricted movement during
this time. The animals could not raise their
paws to their eyes, nor collide with objects, nor
see themselves touching a supporting surface,
nor fall upon the ground. The two control animals
were separated from their mother during
these 1-hr. periods but were permitted free
movement about the room. All animals were
allowed free movement in the dark, thus discouraging
motor retardation. This procedure
was continued for 16 days, i.e., until the animals
were 25 days old.


TABLE A4
PROPORTION OF MAGNIFICATION STIMULI
PRODUCING RESPONSES IN DARK-
REARED KITTENS


S    Experimental group    Control group
1          0.70                0.30
2          0.40                0.50
3          0.00

All five kittens were tested first when 26 days
old. They were placed individually on the response
table and presented with noise tests.
Then each S received 10 magnification trials
alternated with 10 minification trials. The trials
were spaced about 15 sec. apart as a minimum.
However, due to difficulty in getting the kittens
to orient towards the projection screen, longer 
intervals were frequent. The 1.75-sec. magnifica- 
tion rate was used in all cases, and both square 
and circular shaped silhouettes were used. Two 
of the three experimental group animals re- 
ceived magnification trials first, as did one of 
the two control group animals. The null hy- 
pothesis was that there would be no nonchance 
differences in frequency and magnitude of re- 
sponses to magnification versus minification. 

None of the five kittens responded avoid- 
antly to the noise tests. Two of the three experi- 
mental group animals (no association possible) 
and both control animals (association possible) 
manifested some avoidant behavior to magnification
stimuli, but none of either group
responded avoidantly to minification stimuli.
Table A4 shows the proportion of magnification 
stimuli responded to by each S. 

The avoidant responses were far from im- 
pressive in magnitude. The kittens drew their 
heads back weakly, but never locomoted. Even
these weak responses tended to habituate rather 
quickly. One animal from each group stopped 
responding after the first four magnification 
trials. 

Due to the small N, an adequate statistical 
evaluation of the results was not possible. However,
the probability of the observed difference in
response frequency to magnification and minification
is about .12, as evaluated by the Binomial
test. 
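
One reading that reproduces the reported value of about .12 is a two-tailed sign test on the kittens that showed any difference between conditions. The sketch below is offered under that assumption only; the text does not state how the Binomial test was set up. The proportions are those of Table A4, with minification taken as 0.00 for every animal as stated above, and the function name is illustrative.

```python
from math import comb


def sign_test_two_tailed(n_plus: int, n_minus: int) -> float:
    """Two-tailed binomial (sign) test with chance probability 1/2."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Proportion of magnification trials responded to (Table A4):
# three experimental-group kittens followed by two control-group kittens.
magnification = [0.70, 0.40, 0.00, 0.30, 0.50]
minification = [0.00] * 5  # no kitten responded to minification

# Kittens with no difference between conditions are dropped as ties.
n_plus = sum(a > b for a, b in zip(magnification, minification))
n_minus = sum(a < b for a, b in zip(magnification, minification))
p_value = sign_test_two_tailed(n_plus, n_minus)

if __name__ == "__main__":
    print(n_plus, n_minus, p_value)
```

With four kittens differing in the same direction and one tie, this gives 2 × (1/2)^4 = .125.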

Since a modicum of avoidant behavior was
observed in the kittens, it may be tentatively
concluded that although this species does not
seem adversely affected by moderate periods
of deprivation from light and sensory-motor
practice, there is still good reason to state that
there is some unlearned motion perception (and
probably depth perception) in these animals.
Perhaps the failure of Riesen & Aarons (1959)
to find motion discrimination in their kittens
was a result of their long dark-rearing procedure,
or the use of relatively “meaningless” motions,
in the biological sense. The present study
provides admittedly tentative support for
Meyers’ conclusion that dark-reared kittens
can perceive visual movement (Meyers, 1964).

The modified dark-rearing technique used
appears a useful one for allowing sensory-motor
development without allowing certain kinds of
associative experience. However, the fact that
the animals’ responses were weak may be due to
lack of adequate maturation, and it is suggested
that future work along these lines might profit
by waiting until kittens are at least 4 wk. old.


REWARD AND INFORMATION VALUES OF TRIAL OUTCOMES

An attempt was made to separate the reward and information values associated
with trial outcomes in a paired-associate situation. The anticipation
procedure was used with a 25-item list. A trial began with a single letter
appearing on a screen; then S made 1 of 2 available responses and
received 1 of a unique pair of point values associated with that item. The
number of points received over a series of trials was directly related to the 
subject’s monetary payoff for participation. A noncorrection group was 
shown only the point value associated with the response made on a trial; 
a correction group was always shown the points corresponding to both
responses. Learning curves in terms of correct responses (choices of the more
favorable alternative) per trial exhibited significant differences among the 25 
payoff combinations only for the noncorrection group. The combined latency 
and frequency data favored a conclusion that reward magnitudes influenced 
performance under the different payoff combinations but did not affect rate 
of formation of associations between stimuli and trial outcomes. 


In the traditional paired-associate learning
experiment with anticipation procedure,
each response by the subject is
followed by a paired presentation of the
stimulus and correct response for the given
item. The paired presentation (reinforcement)
provides information to the subject
and also, when it follows a correct response,
may be conceived to function as a reward
(McGeoch & Irion, 1952). To facilitate
analysis of the way in which learning
depends on these different aspects of the
trial outcome, the present study introduces
the variable of differential reward magnitude.

The modified experimental situation involves
a list of stimuli and two alternative
responses. For each stimulus a unique pair
of reward values is assigned to the two
responses. By analogy with studies of differential
reward in simple trial and error
learning, such as the T maze or the
Thorndikian verbal learning situation, the
natural procedure would be to start the
trial with the presentation of a stimulus,
allow the subject to choose one of the responses,
and then to give the reward appropriate
to the response made. However,
this noncorrection procedure alone would
not be adequate to elucidate the effects of
differential reward, for any function of the
reward as a satisfier or drive reducer is
inextricably confounded with its function as
a source of information.

1 The experimental work reported in this paper
was conducted at Indiana University and supported
in part by Grant G5525 from the [...]
Contract 225(73) [...] the Office of Naval Research
and Stanford University and Grant MH-6154
from the United States Public Health Service
to Stanford University.

Since it is of theoretical interest to con- 
ceptualize both the informational and the 
satisfying functions of reward, it is also 
desirable to separate the two functions ex- 
perimentally. The addition of a correction, 
or full information, procedure offers a way 
of obtaining the desired separation. With 
the correction procedure, the subject’s task 
is the same as for the noncorrection, but 
the outcome of a trial consists in displaying 
both of the rewards associated with the 
given stimulus. The subject sees not only the 
reward he received for the response that he 
made, but also the reward that he could have 
obtained if he had made the other response. 
Following a choice of the inferior alterna- 
tive, the subject may implicitly correct his 
response; thus the situation is analogous in 
some respects to T-maze learning under a 
correction procedure. 
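
The difference between the two procedures lies only in what is displayed after the response. As a minimal sketch, with illustrative names and point values that are not taken from the experiment, the outcome of a single trial under each procedure might be represented as follows.

```python
from typing import NamedTuple, Optional, Tuple


class Outcome(NamedTuple):
    earned: int                  # points credited for the response made
    shown_left: Optional[int]    # value displayed for the left response
    shown_right: Optional[int]   # value displayed for the right response


def trial_outcome(payoffs: Tuple[int, int], choice: int,
                  correction: bool) -> Outcome:
    """Outcome of one trial; choice is 0 (left) or 1 (right).

    Under the correction procedure both point values are displayed;
    under the noncorrection procedure only the value for the chosen
    response is displayed. The points earned are the same either way.
    """
    earned = payoffs[choice]
    if correction:
        return Outcome(earned, payoffs[0], payoffs[1])
    shown = [None, None]
    shown[choice] = earned
    return Outcome(earned, shown[0], shown[1])


if __name__ == "__main__":
    item_payoffs = (1, 8)  # hypothetical pair of point values for one item
    print(trial_outcome(item_payoffs, 0, correction=False))
    print(trial_outcome(item_payoffs, 0, correction=True))
```

In both cases the subject earns the same points for the same choice; only under the correction procedure does the display also reveal what the other response would have paid.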

If the effects of reward on human learn- 
ing depend primarily on the information 
communicated, we should expect little or
no difference in learning rate among differ- 
ent reward combinations in the case of the 
correction procedure. However, if the func- 
tion of a reward is primarily that of a 
satisfier or drive reducer, we should expect 
faster learning the greater the reward pro- 
vided for a “correct” response to a given 
item. Also, under the latter hypothesis, 
there are grounds for expecting that with a 
fixed reward value for the higher paying 
alternative, rate of learning should be faster 
the greater the reward differential for the 
two responses to a given item. When the 
inferior alternative is selected and the 
smaller reward received, interpretations of 
extinction in terms of frustration, compet- 
ing responses, and possibly even conditioned 
inhibition would lead to the expectation 
that the decremental effect of such an out- 
come on the low-reward response would be 
greater the larger the differential between 
the reward received and the one available 
for the other response. 

Previous attempts to evaluate the roles of
information and effect in human learning,
utilizing more complex learning situations,
have failed to yield convincing positive
evidence that a satisfying aftereffect directly
influences strength of a response (Bitterman,
1956; Hillix & Marx, 1960). We
hope with our simplified experimental design
to separate more sharply the effects of
rewards on learning and on performance.

With respect to more formal theoretical 
interpretations, we anticipate that at least 
to a first approximation the data obtained 
under both conditions may be handled 
either by the one-element pattern model 
(Bower, 1961) which has been successfully 
applied to numerous two-response paired- 
associate experiments by Bower and others 
or, in the case of the noncorrection group,
by a suitable modification of this model. For 
the correction condition, it seems quite 
possible that the model will describe the 
data for each of the reward combinations 
separately, the principal point at issue being 
whether the parameter values would vary 
with the reward values assigned to various 
individual items. For the noncorrection 
condition, some new considerations arise
which will be discussed in a later section of
the paper.
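
For readers unfamiliar with the one-element pattern model, its all-or-none assumption can be illustrated with a short simulation. This is a minimal sketch of the standard two-state form of the model only, not of the modification for the noncorrection condition mentioned above. The parameter names c (probability of conditioning on a reinforced trial) and g (probability of a correct guess before conditioning) are conventional labels, and the values used are illustrative.

```python
import random


def simulate_item(c: float, g: float, n_trials: int,
                  rng: random.Random) -> list:
    """Simulate one item; returns a list of booleans (True = correct)."""
    learned = False
    responses = []
    for _ in range(n_trials):
        # Before conditioning the subject guesses; afterwards always correct.
        correct = True if learned else rng.random() < g
        responses.append(correct)
        # Reinforcement follows every trial; conditioning is all-or-none.
        if not learned and rng.random() < c:
            learned = True
    return responses


def expected_total_errors(c: float, g: float) -> float:
    """Expected errors per item: sum over trials of (1 - c)**(n - 1) * (1 - g)."""
    return (1.0 - g) / c


if __name__ == "__main__":
    rng = random.Random(1961)
    items = [simulate_item(0.25, 0.5, 40, rng) for _ in range(5000)]
    mean_errors = sum(r.count(False) for r in items) / len(items)
    print(mean_errors, expected_total_errors(0.25, 0.5))
```

With two response alternatives g is 1/2, so the expected number of errors per item is 1/(2c). If the parameter values vary with the reward combination, as the text raises as a possibility, the expected error totals would vary accordingly.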

Finally, it appeared likely that, in addition
to the usual record of correct and incorrect
responses, data on response times might
prove especially valuable in the present
situation. Even if the associative process
should prove in some respects invariant
under differences in reward combinations,
it would remain a possibility that performance,
as measured in terms of response
speed, might vary systematically with reward
magnitude. Also, if the frequency data
proved to conform to the all-or-none conception
embodied in the one-element model,
it would be of especial interest to determine
whether the abrupt changes in state of
learning prescribed by that model would
be accompanied by sharp changes in response
speed.




METHOD 


Subjects. Forty-eight nonpsychology majors
at Indiana University were run during the summer
semester of 1961. All subjects (Ss) had previously
been in a probability learning experiment in the
same laboratory and were familiar with the apparatus,
payment procedures for participation, and
the numerical range of the point values used as rewards.

Apparatus. An experimental room contained 
a booth, a projection screen, and an air condi- 
tioner. The booth had two sides and a top, but 
no front or back. 

The booth was supported by a 2 x 4 frame, 
75 in. high and 40 in. wide, and the top, lower 
front, and sides were covered with %4-in. plywood.
The booth was lined with acoustic tile and painted
gray. Inside the booth, a platform was mounted 
30 in. from the floor. A panel 30 in. wide and 15 
in. high was placed on the platform sloping away
from S at a 30° angle to the horizontal plane. 

The panel was painted flat black and con- 
tained, arranged in two columns 7% in. apart, two 
numerical displays, two event lights, and two re- 
sponse buttons. At the top of each column was a 
digital display (Industrial Electronic Engineers 
Inc. Model 10052, lamp 1820) which could illumi- 
nate a %-in.-high digit on a ground-glass lens. 
Each display was located 3 in. from the top of 
the board. An event light (V? in. diameter, milk
white, Dialco type 135) was situated 1% in. below
each display, and 2 in. below each event light
was a response button (1 in. diameter, black
Microswitch type 1PL1). A handrest board was
positioned approximately 7% in. from the response
buttons and was on a line centered between
them. The projection screen stood approximately
7 ft. in front of S’s chair. The stimulus
that appeared on the screen was a single, 10-in.-high,
black letter on a white background.

The experimenter (E) was in an adjacent room 
which also contained apparatus to present stimuli 
and record response information. An IBM 526 
Summary punch was used to read the stimulus 
information from IBM cards and to punch re- 
sponse information into IBM cards. A series of 
relays and timing devices stored and presented 
impulses at the appropriate intervals and con- 
trolled a random-access slide projector which was 
used to present the stimuli. A small window, 
through which slides were projected, was located 
in the wall behind the booth. Communication 
between the two rooms was provided by an in- 
tercom system. 

When the system was in operation, events 
occurred in the following sequence: Prepunched 
cards containing stimulus and reinforcement in- 
formation were placed in the IBM 526. At a signal 
from the programing unit, stimulus and rein- 
forcement information was read into the storage 
unit. The storage unit consisted of three banks 
of relays, which correspond to the three columns 
in the data words of the IBM cards into which 
stimulus and reinforcement information was pre- 
punched. The relays in the programing unit were 


wired so that a particular digit punched in one 
of these columns caused a relay to be energized 
in the bank corresponding to that column. In 
this way, slides were set up, and reinforcing con- 
tingencies arranged. The slide projector was a 
Sarkes Tarzian Model TSP-6A, which allows ran- 
dom access to any of 50 slides on any trial. The 
programing unit controlled the stimulus exposure 
by means of an electromechanical shutter. An 
electronic timer, constructed by the Electronics 
Department at Indiana University, measured re- 
sponse latency to the nearest millisecond and 
stored both the response event and the response 
latency in a series of cold cathode tubes. On signal 
from the programing unit, the electronic timer 
transferred the response information to the IBM 
526, where it was punched into IBM cards. Finally, 
the storage unit was cleared, the timer was re- 
set, and cards in the 526 were advanced so that 
stimulus and reinforcement information punched 
in the next data word was ready to be read into 
storage. Further details concerning the program- 
ing and recording system are given by Friedman, 
Burke, Cole, Keller, Millward, and Estes (1964). 

Procedure. A list of 25 items was constructed 
for each S by assigning a unique letter of the 
alphabet (the stimulus member of the item) to 
a unique pair of points. All letters were used ex- 
cept the letter I, and the pairs of points were the 
25 possible paired combinations of Numbers 1, 
2, 4, 6, and 8. For each item, one member of the 
pair of points was always associated with the left- 
hand button, and the other member with the 
righthand button. 
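
The construction of the item list described above can be illustrated with a short Python sketch. It is an illustration only, not the procedure actually used in the laboratory: the function name `build_item_list` and the seeded random generator are assumptions introduced here, the pairs are taken to be ordered (left, right) so that the five values yield 25 combinations, and the counterbalancing of left and right positions across Ss is not modeled.

```python
import itertools
import random
import string

# Point values named in the Procedure section.
POINT_VALUES = (1, 2, 4, 6, 8)


def build_item_list(rng):
    """Assign each of 25 letters (alphabet minus I) to one (left, right) payoff pair.

    Hypothetical helper: the random assignment stands in for whatever
    assignment of letters to reward combinations was used for a given S.
    """
    letters = [c for c in string.ascii_uppercase if c != "I"]
    # 5 x 5 ordered pairs: 20 unequal-point items and 5 equal-point items.
    pairs = list(itertools.product(POINT_VALUES, repeat=2))
    rng.shuffle(letters)
    return dict(zip(letters, pairs))


items = build_item_list(random.Random(0))
unequal_items = {k: v for k, v in items.items() if v[0] != v[1]}
equal_items = {k: v for k, v in items.items() if v[0] == v[1]}
```

Under these assumptions the list contains 20 items for which one response pays more than the other and 5 items with equal payoffs, matching the counts given in the instructions to S.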

Instructions indicated that when each letter 
appeared on the screen S was to choose the button 
which would yield more points. All responses 
yielded points, but for 20 of the 25 items one of 
the two responses yielded a higher number of 
points than the other. The total number of points 
accumulated by S was tallied automatically in 
the control room and shown to S at the end of 
the experiment. It was made clear that the pay 
scale increased as the total points received in- 
creased and that E was counting the number of 
points received on each trial. The amount of 
money earned per hour ranged from $1.25 to 
$2.00 and was directly related to the number of 
points earned in each session. 

The sequence of events for each trial was as 
follows: (a) Stimulus-response interval: The 
stimulus member appeared on the screen. An in- 
terval of 2.5 sec. was allowed for observing the 
stimulus and responding by pushing one of the 
two response buttons. (b) Outcome interval: The 
digital display above the button which was pushed 
displayed the number of points received. The 
small white event light also was lighted during 


this interval and indicated the side on which 
the response occurred. The stimulus display re- 
mained on during the time the points were dis- 
played for 1 sec. (c) Intertrial interval: All lights 
and stimuli went out for 1.75 sec. 

Experimental Design. Two independent groups 
of 24 Ss were run, one with each of the following 
procedures. (a) Correction procedure: For this 
group the number of points associated with each 
of the response buttons was shown during the 
outcome interval. Although S saw the values as- 
sociated with both responses he received only 
the points that went with the response that he had 
made. This group received 20 complete presenta- 
tions of the list, that is, 500 trials, in a 50-min. 
session. (b) Noncorrection procedure: For this 
group only the number of points associated with 
the response that S had made was shown to S 
during the outcome interval. These Ss received 
40 complete presentations of the list; that is, 
1,000 trials, in two 50-min. sessions with a 10- 
min. break between sessions. 

The order of items within each presentation 
of the list was randomized using the Rand tables. 
For each S the assignment of letters to reward 
combinations was accomplished by scrambling the 
slides on a table and replacing them in the pro- 
jector magazine. The Ss were run in the order in 
which they arrived in the laboratory, but were 
assigned to groups by means of a random number 
table with the restriction that there be 24 Ss per 
group. In order to eliminate bias due to hand 
preference, the left and right positions of the 
points for each item were counterbalanced within 
each group of Ss. 


RESULTS 


Learning curves plotted in terms of mean 
errors per five-trial block for the noncor- 
rection group are presented in the two 
panels of Figure 1. Each of the curves in 
the figure represents a particular payoff 
combination, with the data for sym- 
metrically equivalent items (e.g., 1-2 and 
2-1) combined. Only the items with unequal 
payoff combinations are included, and for 
each item an “error” is defined as a choice of 
the alternative associated with the smaller 
payoff. Each of the data points represents 
240 observations (5 trials, 2 symmetrical 
combinations, 24 subjects). 

The most general statement that can 
be made about the ordering of the mean 
error curves for the noncorrection condition 
is that learning is more rapid the greater 
the difference between the point values 
assigned to the two alternative responses. 
Certain of the curves (e.g., that for the 
6-8 combination) appear to have leveled 
off at error probabilities greater than zero, 
apparently contradicting the assumption 
that human subjects will always learn to 
choose the larger of two alternative re- 
wards uniformly, given sufficient practice. 
Some elucidation of this finding may be 
found in the fact that in a few instances a 
subject settled on the unfavorable alterna- 
tive to a given item and continued making 
this response consistently, evidently never 
discovering that the other response carried 
a higher payoff. In the great majority of 
cases, subjects learned to prefer the higher 
paying (“correct”) response to an item after 
a relatively small number of trials. Using 
the occurrence of five consecutive choices of 
the favorable alternative as a criterion of 
learning, we obtain estimates of the fre- 
quency with which learning of the correct 
response occurred to each payoff combina- 
tion, as shown in the second column of 
Table 1. The instances in which learning 
curves appear to have leveled off sub- 
stantially above zero are precisely those in 
which unusually large numbers of items 
failed to meet the learning criterion; further, 
upon examining the protocols more closely 
in these instances, it is found that for 
three cases in the 1-2 condition, five in the 
2-4 condition, three in the 4-6, and eight in 
the 6-8, the incorrect response was made 
uniformly on all of the last 10 trials of 
the series. In Markovian terminology, the 
probabilistic learning process has ended 
in absorption on the incorrect response, 
and, not surprisingly, the chances of this 
occurring are relatively larger the greater 
the payoff value for the less favorable 
alternative. 

Fig. 1. Learning curves for the noncorrection group. 
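
The learning criterion and the check for absorption on the incorrect response can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a protocol is represented as a list of booleans (True for a choice of the higher-paying alternative), and the function names `trial_of_criterion` and `absorbed_on_incorrect` are hypothetical, not taken from the original analysis.

```python
def trial_of_criterion(correct, run_length=5):
    """Return the 1-based trial on which the first run of `run_length`
    consecutive correct choices is completed, or None if it never occurs."""
    streak = 0
    for trial, is_correct in enumerate(correct, start=1):
        streak = streak + 1 if is_correct else 0
        if streak == run_length:
            return trial
    return None


def absorbed_on_incorrect(correct, last_n=10):
    """True if the incorrect response was made on all of the last `last_n` trials."""
    return len(correct) >= last_n and not any(correct[-last_n:])


# A protocol that alternates early and then settles on the correct response.
example = [False, True, False, True, True, True, True, True, True]
criterion_trial = trial_of_criterion(example)
```

An item for which `trial_of_criterion` returns None would be counted among those failing to meet the criterion; an item for which `absorbed_on_incorrect` is true corresponds to the cases of uniform incorrect responding noted in the text.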

Other statistics for the unequal-point 
items of the noncorrection condition which 
met the criterion of five consecutive correct 
responses are included in Table 1. Total 
errors and trial of last error are self- 
explanatory, and the values of these sta- 
tistics parallel the orderings of the mean 
learning curves. By alternations is meant 
the frequency of alternation between cor- 
rect and incorrect responses. The alternation 
frequency is not so closely related to the 
differences between the two payoffs for an 
item as one might expect, but there is a 
distinct tendency for alternation frequency 
to be high when the sum of the payoff 
values is low and for alternation frequency 
to be low when the sum of the payoffs is 
high. The last two columns of Table 1 give 


the relative frequencies with which an error 
on any trial is followed by an error on the 
subsequent trial and the relative frequency 
of errors during the portion of each protocol 
preceding the last error. Each of these last 
two proportions should be expected to be 
near .5 if learning occurs primarily on an 
all-or-none basis with choice probability 
remaining at chance until an abrupt 
change to the “absorption state” occurs. 
The only notable deviations from this pic- 
ture occur in the case of the items with the 
highest total point combinations, for which 
the proportion of errors during the pre- 
criterion sequence is higher than would be 
expected on the basis of chance alone. 
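
The statistics reported in Table 1 can be illustrated for a single protocol with the following Python sketch. The exact computational conventions of the original analysis are not stated, so several choices here are assumptions: alternations are counted over the whole protocol, P(E:E) is taken over errors that have a following trial, and the precriterion error proportion is taken over the trials preceding the last error. The function name `protocol_statistics` is hypothetical.

```python
def protocol_statistics(correct):
    """Summary statistics for one item protocol (list of booleans, True = correct)."""
    errors = [not c for c in correct]
    total_errors = sum(errors)
    # 1-based trial number of the last error (0 if no error occurred).
    last_error = max((i for i, e in enumerate(errors, start=1) if e), default=0)
    # Number of changes between correct and incorrect on successive trials.
    alternations = sum(a != b for a, b in zip(correct, correct[1:]))
    # Proportion of errors that are followed by an error on the next trial.
    followers = [nxt for cur, nxt in zip(errors, errors[1:]) if cur]
    p_error_after_error = sum(followers) / len(followers) if followers else None
    # Proportion of errors on the trials preceding the last error.
    pre = errors[: last_error - 1] if last_error > 1 else []
    p_error_pre = sum(pre) / len(pre) if pre else None
    return {
        "total_errors": total_errors,
        "trial_of_last_error": last_error,
        "alternations": alternations,
        "p_error_after_error": p_error_after_error,
        "p_error_precriterion": p_error_pre,
    }


stats = protocol_statistics(
    [False, True, False, False, True, True, True, True, True]
)
```

On the all-or-none view described in the text, the last two quantities, pooled over many protocols, would both be expected to lie near .5.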

A similar presentation of data for the 
unequal items is given in Figure 2 and 
Table 2 for the correction condition. In 
sharp contrast to the pattern exhibited by 
the noncorrection group, we find in the 
correction case almost no trace of any 
systematic relationship between rate of 
learning and payoff values. There are no 
instances of absorption on the unfavorable 
alternative; about 93% of the items meet 
the criterion for learning of the correct 
response and in these instances, for all 
payoff combinations, error probability re- 
mains near chance during the precriterion 
sequence. The only hint of any orderly 
relationship between learning rate and 
point value is found if we examine the 
various statistics in relation to the number 


TABLE 1 

STATISTICS FOR UNEQUAL-POINT ITEMS 
NONCORRECTION PROCEDURE 

                  Total errors    Trial of last error   Alternations 
Points    N*       M      SD         M       SD              M          P(E:E)   P(E) 
  12       ?     5.24    4.18      10.24    7.96            5.95         .40     .47 
  14       ?     5.43    4.49      10.17    7.89            5.50         .44     .49 
  16      48     4.96    4.72       8.25    7.12            4.31         .51     .55 
  18      47     3.15    3.09       5.36    5.62            3.04         .40     .51 
  24      41     6.76    6.06      11.51    8.94            6.10         .50     .55 
  26      47     3.30    3.18       5.83    5.78            2.94         .48     .50 
  28      48     3.97    4.06       5.97    6.66            2.71         .51     .55 
  46      45     6.71    6.12      10.58    9.04            4.93         .60     .60 
  48      48     4.94    5.92       7.02    7.43            2.94         .65     .66 
  68      40     5.92    6.48       7.40    7.05            2.88         .67     .70 
 Total   452     4.85    5.09       8.05    7.72            4.09         .52     .56 

* Number of items meeting criterion of five consecutive correct responses. 
Fig. 2. Learning curves for the correction group. 

of points associated with the less favorable 
alternative for each item. Mean total errors, 
trial of the last error, and alternations all 
decrease slightly as the point value of the 
less favorable alternative increases. Be- 
cause this relationship appeared most con- 
sistent for the total error statistic, an 
analysis of variance was run, with the 
four “treatment” groups being defined by 
the point value of the smaller alternative; 
the F value obtained was only .53 with 3 
and 69 degrees of freedom. 

In the case of the equal-point items 
(1-1, 2-2, etc.) it might seem at first 
thought that no learning in the usual sense 
could be expected. Nonetheless, protocols for 
these items are strikingly similar to the 
protocols for unequal-point items: That is, 
a typical protocol begins with alternations 
between the two alternative responses and 
ends with a long succession of occurrences 
of one response or the other. To permit 
analysis of these systematic changes in re- 
sponse probability over trials, the following 
convention was adopted: For each item the 
response more frequently made over the 
entire series of trials was defined as the 
correct response for that item by that sub- 
ject and the other alternative was defined 
as the incorrect response. Once the cor- 
rect response for an item had been 
determined, the analysis proceeded in the 
same way as described above for un- 
equal-point items. The learning curves in 
terms of mean "error" frequency for a five- 
trial block are given for both correction 
and noncorrection groups in Figure 3 and 
the other summary statistics in Table 3. 
The principal qualitative observation to 
be made is that the percentage of items 
meeting criterion (approximately 90 for 
each condition), the rate of learning, 


TABLE 2 
STATISTICS FOR UNEQUAL-POINT ITEMS 
CORRECTION PROCEDURE 

                  Total errors    Trial of last error   Alternations 
Points    N*       M      SD         M       SD              M          P(E:E)   P(E) 
  12      45     2.53    2.53       4.13    4.05            2.36         .45      ? 
  14      44     3.07    2.52       4.80    4.45            2.75         .46      ? 
  16      45     2.93    2.51       4.69    4.08            2.53         .48      ? 
  18      46     2.13    2.05       3.46    3.42            1.98         .43      ? 
  24       ?     2.18    1.95       3.30    3.10            1.98         .43      ? 
  26      45     2.13    1.83       3.67    3.58            2.20         .35     .47 
  28      46     2.41    2.31       3.0     3.82            2.24         .42      ? 
  46       ?     2.30    1.73       3.0     2.52            1.77         .47      ? 
  48      45     1.4     1.86       2.93    3.46            1.73         .40      ? 
  68       ?     2.07    1.63       2.84    2.42            1.84         .40      ? 
 Total   448     2.4     2.15       3.00    3.61            2.4          .43      ? 

* Number of items meeting criterion of five consecutive correct responses. 


remained very close to chance during the 


TABLE 3 
STATISTICS FOR EQUAL-POINT ITEMS 


22 2.32 2.46 3.00918 2.05 AS a 
22 1.91 2.15 2.82 2.94 1.41 52 30 
20 LM. IAE 30 3.85 1.85 E з 
24 2.0 1.90 $04 2.80 1.02 E 5 
21 2.95 2.89 49 — 5.03 2.67 AS 53 
109 2.0 2.35 355 3.60 1.97 A5 52 

Noncorrection procedure 
n 17 7.47 5.10 14.59 9.68 8.82 39 48 
2 21 8.49 5.08 15.76 10.58 8.86 A8 49 
4 24 4.58 4.59 8.71 7.9 4.96 42 47 
66 23 3.48 5.56 5% 7.7 2.78 ET 61 
88 24 4.08 5.81 6.67 8.73 3.08 .58 57 
Total 109 5.38 5.57 9.81 9.85 5.4 46 .51 


* Number of items meeting criterion of five consecutive correct responses. 


TABLE 4 

PRECRITERION STATIONARITY IN TERMS OF 
PROPORTION CORRECT PER VINCENT 
QUARTILE 

               Unequal items                 Equal items 
Quartile   Noncorrection  Correction   Noncorrection  Correction 
   1            .39           ?             .49           .47 
   2            .38          .45             ?            .46 
   3            .45          .43            .55           .34 
   4            .53          .57            .53           .51 

the precriterion trials for each protocol are 
divided into Vincent quartiles. Proportions 
of correct responses per precriterion quartile 
are given for pooled equal and pooled un- 
equal items for each procedure in Table 4. 
The general picture appears to be one of 
a slight upward trend in correct responding 
over the precriterion sequence, but with the 
proportion even in the final precriterion 
quartile not being much above .5. It seems 
likely that the upturn in the final quartile 
is due to the fact that some subjects in 
the final state have success probabilities 
slightly short of unity; a few instances of 
subjects who are in the terminal state but 
make an error before completing the full 
criterion run of five consecutive correct re- 
sponses would contaminate the estimate for 
the final quartile. 
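
The division of precriterion trials into Vincent quartiles can be sketched in Python as below. This is a simplified illustration, not the original computation: each precriterion trial is assigned to a quartile by integer apportionment (trial i of n goes to quartile floor(4i/n), counting from zero) rather than by fractional weighting, and the function name `vincent_quartile_proportions` is an assumption introduced here.

```python
def vincent_quartile_proportions(protocols):
    """Pooled proportion correct in each Vincent quartile.

    `protocols` is a list of precriterion sequences, each a list of
    booleans or 0/1 values (truthy = correct response).
    """
    correct_counts = [0, 0, 0, 0]
    trial_counts = [0, 0, 0, 0]
    for pre in protocols:
        n = len(pre)
        for i, is_correct in enumerate(pre):
            quartile = (4 * i) // n  # 0..3, equal fractions of this protocol
            correct_counts[quartile] += 1 if is_correct else 0
            trial_counts[quartile] += 1
    return [c / t if t else None for c, t in zip(correct_counts, trial_counts)]


proportions = vincent_quartile_proportions(
    [[0, 0, 1, 1, 0, 1, 1, 1], [0, 1, 0, 1]]
)
```

Because each protocol contributes the same fraction of its own length to every quartile, long and short precriterion sequences can be pooled on a common relative scale, which is the purpose of the Vincent procedure.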

Since details of the precriterion data are 
of special interest with respect to models 
for paired-associate learning (see for ex- 
ample, Bower, 1962; Estes, 1964a; Suppes 
& Ginsberg, 1963), a number of other rele- 
vant analyses are presented in Tables 5-8. 
In each case, in order to obtain reasonable 
numbers of observations, the data for each 
condition have been pooled over various 
payoff combinations. If learning occurs on 
an all-or-none basis, each protocol should 
begin with a mixture of successes and fail- 
ures followed, when the subject leaves the 
guessing state, by a sequence of correct 
responses (perhaps occasionally, in real 
life, interspersed with occasional “careless” 
errors). During the portion of the protocol 
prior to the final uniform sequence of cor- 
rect responses, the mean lengths of suc- 
cessive runs of errors should be approxi- 
mately constant, and the same should be 
true for successive runs of successes. Con- 
versely, if learning proceeds by a gradual 
change in probability of correct responding, 
the error runs should become progressively 
shorter and success runs progressively 
longer during the precriterion sequence. 
Further, if either the aftereffects of correct 
responses or those of incorrect responses 
should produce any gradual changes in re- 
sponse probability, these should be reflected 
in progressive changes in lengths of the 
success runs or error runs, respectively. 
Data on mean lengths of error runs are 
given for both noncorrection and correction 
conditions and for unequal and equal items 
separately in Table 5 and similar statistics 
for success runs in Table 6. Referring to 
Table 5 first, it may be seen that for the 
unequal items there is a modest but 
systematic tendency for length of error 
runs to decrease over the precriterion se- 
quence. The trend is rather stronger for the 
noncorrection procedure, as might be ex- 
pected, since it is impossible for the subject 
to gain information about both the larger 
and smaller payoffs associated with any 
given item on a single trial. For the correc- 
tion condition there is the same amount of 
information to be learned concerning each 
item, but in that case it is possible for 
both stimulus-response-payoff relation- 
ships to be learned on a single trial. Some- 
what in contrast to the result for error runs, 
Table 6 exhibits relative constancy of mean 
lengths of successive success runs for both 
types of items and both conditions. The 
data provide no support for the idea that 
TABLE 5 
MEAN LENGTHS OF ERROR RUNS 

Ordinal            Noncorrection            Correction 
number of 
run               Mean   Frequency       Mean   Frequency 
Unequal items 
   1              2.38      386          1.87       ? 
   2              2.02      269          1.61       ? 
   3              1.86      164          1.67       ? 
   4              1.79       99          1.48       ? 
Equal items 
   1              1.90       83          1.99       ? 
   2              1.97       60           ?         ? 
   3              1.90       39          1.58       ? 
TABLE 6 
MEAN LENGTHS OF SUCCESS RUNS 

Ordinal            Noncorrection            Correction 
number of 
run               Mean   Frequency       Mean   Frequency 
Unequal items 
   1              1.79      317          1.73      216 
   2              1.83      197          1.56       93 
   3              1.79      122          1.76       38 
   4              1.77        7          1.40       10 
Equal items 
   1              1.77       73          1.90       51 
   2              1.92       50          1.50       22 
   3              1.71       35          2.12        8 


satisfying aftereffects of correct responses 
produce any strengthening effect on the 
associations involved. Rather, it appears 
more parsimonious to conclude that no 
learning occurs on precriterion success 
trials. 
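
The run-length analysis of Tables 5 and 6 can be illustrated with the following Python sketch. It assumes that each precriterion sequence is a list of booleans (True for a correct response) and that runs are numbered in order of occurrence within a protocol; the function names `run_lengths` and `mean_run_length_by_ordinal` are hypothetical and the original tabulation may have differed in detail.

```python
from itertools import groupby


def run_lengths(precriterion):
    """Return (error_runs, success_runs): run lengths in order of occurrence."""
    error_runs, success_runs = [], []
    for is_correct, group in groupby(precriterion, key=bool):
        length = len(list(group))
        (success_runs if is_correct else error_runs).append(length)
    return error_runs, success_runs


def mean_run_length_by_ordinal(protocols, kind="error", max_ordinal=4):
    """Mean length and frequency of the 1st, 2nd, ... run, pooled over protocols."""
    sums = [0] * max_ordinal
    counts = [0] * max_ordinal
    for pre in protocols:
        error_runs, success_runs = run_lengths(pre)
        runs = error_runs if kind == "error" else success_runs
        for k, length in enumerate(runs[:max_ordinal]):
            sums[k] += length
            counts[k] += 1
    return [(s / c if c else None, c) for s, c in zip(sums, counts)]


sample = [
    [False, False, True, False, True, True, False],
    [False, True, False, False, False],
]
error_table = mean_run_length_by_ordinal(sample, kind="error")
success_table = mean_run_length_by_ordinal(sample, kind="success")
```

Approximately constant means across ordinal positions would be the pattern expected under all-or-none learning, whereas shrinking error runs or lengthening success runs would point to gradual change.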

If the small but systematic deviations 
from stationarity in the case of the error 
statistics are due primarily to the fact that 
two separate items of information have to 
be learned in connection with each item, 
then we might expect a closer approxima- 
tion to stationarity on trials between the 
first success and the last error since in 
many, perhaps most, cases one of the as- 
sociations would have been learned by the 
trial of the first success. This supposition 
seems to be borne out by the data exhibited 
in Table 7. Under neither condition and 
for neither type of item (equal and unequal 
point) is there any tendency manifest for 
success probabilities to increase over this 
portion of the precriterion sequence. 

If the general course of learning is for 
each item to shift on some critical trial 
from an unlearned state in which responses 
occur through random guessing into a 
learned state in which correct responding 
has probability of approximately unity, 
then during the precriterion portion of each 
protocol we should observe not only sta- 
tionarity of the error probability over 
trials but also an essentially random se- 
quence of successes and errors. To check on 
this character of the precriterion data we 
exhibit in Table 8 an analysis suggested by 


Suppes and Ginsberg (1963), namely, the 
distribution of error frequencies in non- 
overlapping blocks of precriterion trials. To 
generate this table, each precriterion proto- 
col was marked off into successive four- 
trial blocks and the frequency of errors in 
each four-trial block tabulated. In the 
table are given for each type of item in 
each condition the observed frequencies of 
errors for all usable trial blocks together 
with the theoretical values generated from a 
binomial distribution with a mean equal 
to the observed mean. For both types of 
items under the correction condition the 
agreement between the observed distribu- 
tions and the corresponding binomial distri- 
butions is exceptionally close, the χ²'s being 
far from significant. The disparities are not 
very large even for the noncorrection pro- 
cedure with the equal items, but become 
substantial and significant for the noncor- 
rection procedure with unequal items. The 
nature of the disparities in this last case 
is that there is too much perseveration of 
both correct responses and errors from 
trial to trial, a result which is not unex- 
pected in light of previous analyses show- 
ing tendencies for subjects to become 
fixated on incorrect responses in the un- 
equal noncorrection condition especially 
when the pair of payoffs for an item is 
such that even the smaller value is rela- 
tively large. 
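
The comparison with the binomial distribution can be sketched in Python as follows. The sketch fits a binomial with four trials per block and a mean equal to the observed mean, as the text describes; the function names `binomial_expected` and `chi_square` are assumptions introduced here, and the observed frequencies are those printed in Table 8 for unequal items under the noncorrection procedure.

```python
from math import comb


def binomial_expected(observed, block_size=4):
    """Expected block frequencies from a binomial matched to the observed mean.

    `observed[k]` is the number of blocks containing exactly k errors.
    """
    n_blocks = sum(observed)
    mean_errors = sum(k * f for k, f in enumerate(observed)) / n_blocks
    p = mean_errors / block_size  # estimated error probability per trial
    return [
        n_blocks * comb(block_size, k) * p**k * (1 - p) ** (block_size - k)
        for k in range(block_size + 1)
    ]


def chi_square(observed, expected):
    """Pearson chi-square statistic for observed vs. expected frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Unequal items, noncorrection procedure (0 to 4 errors per four-trial block).
observed_unequal_noncorrection = [23, 149, 228, 156, 121]
expected = binomial_expected(observed_unequal_noncorrection)
statistic = chi_square(observed_unequal_noncorrection, expected)
```

With these inputs the rounded expected frequencies should agree with the theoretical column of Table 8, and the statistic should come out close to the tabled value for this condition; the excess of blocks with 0 or 4 errors relative to 3 is what the text describes as perseveration.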


TABLE 7 

PROPORTION CORRECT ON TRIALS BETWEEN 
FIRST SUCCESS AND LAST ERROR 

             Unequal items        Equal items 
Trial        P(C)      N          P(C)      N 
Correction 
  1          .555     182         .644      45 
  2          .534     148         .500      34 
  3          .416     113         .518      20 
Noncorrection 
  1          .530     285         .574      68 
  2          .550     251         .509      57 
  3          .498     221         .402      52 
  4          .43      192         .490      49 
  5          .494     170         .532      47 
  6          .464     153         .981      42 
  7          .438     137         .590      39 
  8          .443     122         .500      36 
TABLE 8 
DISTRIBUTION OF ERROR FREQUENCIES IN PRECRITERION BLOCKS 

                    Unequal items (four-trial blocks)            Equal items (four-trial blocks) 
Frequency      Noncorrection         Correction             Noncorrection         Correction 
of errors    Observed  Theoretical  Observed  Theoretical  Observed  Theoretical  Observed  Theoretical 
    0           23         22          10         11           2         11           4          2 
    1          149        120          53         49          61         47          12         11 
    2          228        243          81         82          76         73          16         20 
    3          156        219          57         61          41         51          15         16 
    4          121         74          20         17          16         13           8          5 
    χ²             56.0                    1.20                   14.41                    4.20 


Learning curves for latencies, plotted in 
terms of mean latency per trial for correct 
and incorrect responses considered sepa- 
rately, are given in Figure 4 for the noncor- 
rection group and in Figure 5 for the cor- 
rection group. The principal trends are in 
many respects about what one might ex- 
pect by analogy with such situations as 
the T maze. Latencies of correct responses 
decrease somewhat over trials and ap- 
proach slightly lower terminal levels for 
the correction than for the noncorrection 
procedure. Error latencies are uniformly 
higher on the average than correct-re- 
sponse latencies and, although with con- 


siderable fluctuation owing to the lower 
frequencies of errors, the error latencies 
remain relatively constant over trials. Re- 
calling, however, that the frequency an- 
alysis yielded considerable evidence that 
items are in distinctly different states of 
learning in the pre- and postcriterion por- 
tions of each sequence, we are led to in- 
quire whether any important effects may 
be masked by the overall curves of Figures 
4 and 5. 

Fig. 4. Mean latency of correct and error responses for criterion items: noncorrection group. 

Evidence pertaining to this point is ex- 
hibited in Table 9 for the noncorrection 
and Table 10 for the correction condition. 
In these tables mean latencies are given 
for pre- and postcriterion portions of the 
data separately, the postcriterion trials 
for each sequence being renumbered so that 
the first trial following the criterion run 
of five consecutive correct responses is 
numbered 1, and so on (means being in- 
cluded only for trials on which 20 or more 
observations are available). The picture 
that emerges is indeed quite different from 
that of the overall mean curves. On the 
precriterion trials, correct and error laten- 
cies are very close together and both tend 
to increase slightly over precriterion trials. 
The close agreement of the correct and er- 
ror latencies in this portion of the data fits 
nicely with the conclusion suggested by the 
frequency analysis that, prior to the cri- 
terion, items are in an unlearned state with 
correct and error responses occurring at 
random. The increase in mean latencies 
over the precriterion sequence may well 
be merely a matter of item selection; on 
the assumption that latency is related to 
difficulty of the items, an increasing trend 
might be expected in view of the fact that 
on the later precriterion trials the more 
difficult items will be relatively more heav- 
ily represented. 

During postcriterion trials, for the non- 
correction condition, latencies of correct 
responses are strikingly constant over 
trials; error latencies, occurring only for 
the unequal items, are uniformly higher 
but also relatively constant. The sugges- 
tion clearly emerges that the overall trend 
toward decreasing latencies of correct re- 
sponses exhibited in Figure 4 is primarily 
a matter of items shifting from the pre- 
criterion to the postcriterion state, the 
latency distribution for the precriterion 
state having a mean 100 to 200 msec. 
higher than the distribution for the post- 
criterion state. For the correction condi- 
tion, data shown in Table 10 yield a 
rather similar picture except that there is a 
slight but consistent decrease in correct re- 
sponse latencies during postcriterion trials. 

Finally, we consider the breakdown of 
latencies according to the point-value com- 
binations for the different types of items 
and the two conditions. In Table 11 the 
mean latencies are given for correct and 
error responses separately for the first 10- 
trial block and, for correct responses only 
(there being very few errors), for the 
fourth 10-trial block in the case of the non- 
correction condition. Similar statistics are 
given for the first and second 10-trial 
blocks for the correction condition in 
Table 12. 

For the noncorrection condition clear 
and substantial relationships emerge be- 
tween mean latency and point values, the 
principal trends being for latency to de- 
crease as the higher payoff of the pair 
increases and, when the higher payoff is held 
constant, to decrease as the sum of the 
payoffs for an item increases. For the cor- 
rection condition the trends are much 
weaker with the only apparent trend be- 
ing a slight tendency for latencies to de- 
crease as the sum of the payoffs for an 
item increases. On the whole, the picture 
seems quite favorable to an interpretation 
of learning in this situation in terms of the 
conditioning of approach and avoidance 
tendencies along much the same lines as 
TABLE 9 
MEAN LATENCY OF CORRECT AND ERROR RESPONSES FOR THE NONCORRECTION GROUP 
Unequal Equal 
Trial Precriterion Postcriterion Precriterion Postcriterion 
C E C E C E C E 

1 1.35 1.31 1.29 1.52 1.33 1.42 1.31 = 
2 1.42 1.39 1.32 1.52 1.41 1.44 1.30 -= 
3 1.45 1.34 1.80 1.57 1.36 1.47 1.26 — 
4 1.37 1.38 1.31 1.50 1.36 1.45 1.28 — 
5 1.39 1.38 1.31 1.60 1.40 1.55 1.26 — 
6 1.48 1.40 1.28 1.50 1.48 1.34 1.25 = 
7 1.45 1.37 1.80 1.51 1.55 1.41 1.208 = 
8 1.40 1.40 1.28 1.61 — 1.50 1.29 | — 
9 1.40 1.38 1.32 1.39 — 1.40 1.30 — 
10 1.47 1.42 1.30 1.48 — — 1.25 — 
11 1.56 1.39 1.29 .1.37 — — 1.4 — 
12 1.87 1.39 1.28 Eas — — 1.28 = 
13 1.55 1.48 1.29 — — — 1.23 = 
14 1.45 1.46 71.31 — — — 1.97 = 
15 1.45 1.37 1.29 — — — 1.26, = 
16 1.42 1.40 1.30 — — — 1.30 = 
17 1.33 1.42 1.30 — - — 1.25 
18 1.47 "3781 1.29 — — — 1.20: = 
19 — 1.29 1.31 — — — Sy 
20 1.52 1.42 1.31 — — — 1.28... 


that assumed to characterize simple trial- 
and-error learning in such situations as the 
T maze. There is little encouragement for a 
picture of the subjects operating in the 
manner of decision theorists with longer 
latencies reflecting the more difficult decision 
problems; in Table 11, for example, we find 
lower latencies for correct responses on a 


6-8 item than on a 1-8 and in the equal 
category lower latencies for the 8-8 than 
for the 1-1 or 2-2. Evidently the approach 
tendencies evoked by higher payoffs become 
conditioned and lead to lower latencies, 
but there is no indication that “approach- 
approach conflict” arises when both re- 
sponses to an item carry high payoffs. 


TABLE 10 
MEAN LATENCY OF CORRECT AND ERROR RESPONSES FOR THE CORRECTION GROUP 
Unequal Equal 

Trial Precriterion Postcriterion Precriterion Postcriterion 
C E C E C E C E 

1 1.45 1.34 1.30 1.59 1.44 1.39 1.92. om 
2 1.37 1.44 1.30 1.51 1.48 1.48 1.02 “а 
3 1.43 1.41 1.29 1.50 1.45 1.41 1,98 196 
4 1.43 1.55 1.29 1.41 -— 1.33 1.30 
5 1.56 1.53 1.28 1.80 -- — 1.92. md 
6 Ed. EDI 1.25 = — - 1.24 тй 
7 1.60 1.61 1.26 25 — — 1:82. ud 
8 1.66 1.55 1.24 — — — Lidl’ ^" 
9 ag zx Гб M ccs s = 1:93 nite 
10 — — 1.20 = — — 1.28... m te 
11 — — 1.23 — — — 1:21. 4 
12 == — 1.22 — — — 1.28 Е 
13 — — 1.22 — J du 1.390005 
14 — — 1.19 — — — 1.22.11 Шш 
15 = = 1.25 — — 254 1.18 was 
TABLE 11 

MEAN LATENCY OF CORRECT AND ERROR RESPONSES BY POINT COMBINATIONS FOR 
THE NONCORRECTION GROUP 

                            Block 1                            Block 4 
                    C                     E                       C 
Points     Frequency  Latency    Frequency  Latency     Frequency  Latency 
Unequal 
  12          322      1.42         157      1.45          392      1.40 
  14          362      1.38         173      1.42          442      1.34 
  16          417      1.36         166      1.41          465      1.30 
  18          443      1.34         120      1.32          465      1.30 
  24          314      1.41         179      1.42          382      1.36 
  26          423      1.36         134      1.30          457      1.32 
  28          446      1.29         127      1.35          471      1.30 
  46          349      1.38         189      1.37          444      1.30 
  48          415      1.31         164      1.39          471      1.27 
  68          323      1.28         142      1.34          395      1.26 
Equal 
  11           84      1.47          66      1.51          135      1.33 
  22          114      1.35          83      1.50          165      1.31 
  44          158      1.36          82      1.50          197      1.35 
  66          182      1.32          46      1.38          204      1.27 
  88          175      1.30          51      1.36          225      1.22 


In view of the suggestion which was 
made earlier that some of the trends in 
the overall analysis might reflect differen- 
tial latencies associated with differential 
difficulty of items, we give finally in Table 
13 a latency analysis similar to that of 


Tables 11 and 12 but restricted to the more 
difficult items, specifically those for which 
the last error occurred on or after Trial 10. 
Thus, this table includes latencies only 
for precriterion data of the more difficult 
items. The principal result of this subsid- 


TABLE 12 

MEAN LATENCY OF CORRECT AND ERROR RESPONSES BY POINT COMBINATIONS FOR 
THE CORRECTION GROUP 

                            Block 1                            Block 2 
                    C                     E                       C 
Points     Frequency  Latency    Frequency  Latency     Frequency  Latency 
Unequal 
  12          337      1.36         104      1.53          427      1.31 
  14          319      1.38         125      1.47          427      1.31 
  16          321      1.37         107      1.52          436      1.28 
  18          361      1.36          96      1.39          448      1.29 
  24          344      1.33          91      1.51          433      1.26 
  26          352      1.31          96      1.46          428      1.27 
  28          354      1.32         100      1.47          443      1.24 
  46          345      1.37         101      1.43          425      1.32 
  48          365      1.32          81      1.47          442      1.26 
  68          343      1.31          85      1.39          430      1.22 
Equal 
  11          166      1.36          47      1.49          201      1.28 
  22          173      1.35          47      1.52          204      1.31 
  44          161      1.32          37      1.38          190      1.33 
  66          178      1.34          50      1.41          199      1.29 
  88          158      1.29          50      1.37          191      1.29 
TABLE 13
MEAN LATENCY OF CORRECT AND ERROR RESPONSES BY POINT COMBINATIONS FOR DIFFICULT ITEMS

                      Correction                         Noncorrection
Points          C                E                  C                E
          Freq.  Latency   Freq.  Latency     Freq.  Latency   Freq.  Latency
Unequal
  12        21    1.80       28    1.94        105    1.40       93    1.46
  14        35    1.69       35    1.67         99    1.46      105    1.45
  16        21    1.67       23    1.62         70    1.39      100    1.41
  18        13    1.58       17    1.46         51    1.48       48    1.30
  24         6    1.70       13    1.75         85    1.54      115    1.43
  26        11    1.65        9    1.55         59    1.30       81    1.22
  28        20    1.57       18    1.46         33    1.40       56    1.37
  46         0     -          0     -           71    1.42      138    1.34
  48        11    1.35        8    1.48         29    1.31       81    1.27
  68         0     -          0     -           26    1.19       71    1.35
Equal
  11         6    1.53        8    1.72         51    1.48       49    1.51
  22         0     -          0     -           62    1.42       66    1.52
  44        13    1.50       17    1.40         41    1.46       39    1.50
  66         0     -          0     -           12    1.35       18    1.23
  88        28    1.38       22    1.24         23    1.54       27    1.29


iary analysis is to show that for this 
portion of the data the trends exhibited in 
the two preceding tables are in general 
accentuated. In particular, the difficult 
items for the correction procedure show 
the same tendency for latency to decrease
with increasing payoff values that
was seen previously in the full data for the 
noncorrection procedure. On the other 
hand, there are no substantial or consistent 
differences between correct and error la- 
tencies in these data, again supporting the 
conclusion emerging from earlier analyses 
that in the precriterion portion of each 
sequence items are in the same state of 
learning regardless of the point values. 
Nonetheless, even when items are in the 
unlearned state, response times are lower 
when the payoffs are higher, suggesting 
that approach and avoidance tendencies 
become conditioned to the stimuli some- 
what independently of the formation of 
specific learned associations between the 
stimuli and specific outcomes. Put differ- 
ently, learning that an item carries a high 
payoff and learning which alternative leads 
to the higher payoff are independent processes
which can be separated by suitable
analyses of the combined latency and
frequency data.


THEORETICAL ANALYSES 


One-Element Analysis of Learning of 
Unequal Items. In the case of the correc- 
tion procedure and unequal items, the sit- 
uation for subjects in this experiment was 
not greatly different from that of ordinary 
paired-associate learning experiments with 
two alternative responses per item. Thus, 
although the total situation was somewhat 
complex in the present instance, so that 
there may well be a bit more “noise” in 
the data than in some of the experiments 
reported by Bower (1961, 1962), we might 
expect that the course of learning would be 
approximately that prescribed by the one 
element, all-or-none model. The assump- 
tions of the model are, as interpreted in 
terms of the present situation, that each 
item begins in a guessing state in which 
the two alternative responses occur at random
with equal probabilities, that the item
remains in this state until the trial on
which learning occurs, that there is some
constant probability c that learning will occur
on each reinforced trial, and that following
learning of a given item correct responses
occur uniformly throughout the
remainder of the sequence (Atkinson &
Estes, 1964; Bower, 1961; Estes, 1964a).
It will be recognized that the pattern of
results described in preceding sections 
agrees reasonably well with the picture 
implied by these assumptions on a quali- 
tative level, despite some relatively small 
second-order disparities (as, for example, 
the small rise in the Vincent curves for 
precriterion correct responding during the 
fourth quartile of the precriterion se- 
quence). 
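
The assumptions just listed can be made concrete with a small simulation. The following is only an illustrative sketch in Python, not part of the original analysis: the function names, the number of simulated items, and the random seed are all hypothetical choices, and the guessing probability of 1/2 is the value stated in the text. If the assumptions hold, the simulated mean total errors should lie close to 1/(2c).

```python
import random


def simulate_item(c, n_trials, rng, guess_correct=0.5):
    """Simulate one item under the one-element, all-or-none model.

    Returns a list of 0/1 error indicators, one per trial.
    """
    learned = False
    errors = []
    for _ in range(n_trials):
        if learned:
            # After learning, responses are uniformly correct.
            errors.append(0)
        else:
            # In the guessing state the response is correct with probability 1/2.
            errors.append(0 if rng.random() < guess_correct else 1)
            # Every trial is reinforced; learning occurs with probability c.
            if rng.random() < c:
                learned = True
    return errors


def mean_total_errors(c, n_items=20000, n_trials=40, seed=1):
    """Average number of errors per item over n_items simulated items."""
    rng = random.Random(seed)
    total = sum(sum(simulate_item(c, n_trials, rng)) for _ in range(n_items))
    return total / n_items


if __name__ == "__main__":
    c = 0.214  # value reported in the text for the correction condition
    print("simulated:", round(mean_total_errors(c), 2))
    print("expected 1/(2c):", round(0.5 / c, 2))
```

With 40 trials and c near .2 the truncation of the sequence has a negligible effect, so the comparison with 1/(2c) is reasonable for this illustration.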

Since for the unequal items run with the 
correction procedure there were no appre- 
ciable differences among different payoff 
combinations with respect to such statistics
as total errors and trial of the last error,
the data for the various payoff conditions
have been pooled for purposes of a
more detailed application of the one-ele- 
ment model. Following the customary pro- 
cedure, the conditioning parameter c was 
estimated from the mean total errors, 
proving to be .214 for this condition, and 
then the predicted learning curve in terms 
of proportion of errors per trial was cal- 
culated as well as other statistics of the 
data such as standard deviation of total 
errors, mean and standard deviation of 
trial of the last error, etc. The mean learn- 
ing curve is compared with the theoretical 
curve in Figure 6, and the observed and 




theoretical values of various other sta- 
tistics are listed in Table 14. The outcome 
of this analysis, as in the case of previous 
studies of simple paired-associate learning 
with two response alternatives, is that the 
assumptions of the model together with 
the single estimated parameter account 
for many of the detailed properties of the 
data. 
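
The estimation step described above can be checked numerically. The sketch below is offered only as an illustration: it uses the standard one-element results for a guessing probability g (mean total errors (1 - g)/c, variance u + (1 - 2c)u² with u the mean, mean trial of last error u/[1 - g(1 - c)], and lag-J error autocorrelation u(1 - g)(1 - c)^J), which are assumed here rather than quoted from the text, and the function and key names are hypothetical.

```python
def one_element_predictions(mean_errors, g=0.5, max_lag=3):
    """Estimate c from mean total errors and derive other predicted statistics.

    g is the probability of a correct guess in the unlearned state.
    """
    c = (1.0 - g) / mean_errors          # conditioning parameter
    u = (1.0 - g) / c                    # predicted mean total errors (equals input)
    sd_errors = (u + (1.0 - 2.0 * c) * u ** 2) ** 0.5
    last_error = u / (1.0 - g * (1.0 - c))
    # Expected number of error pairs separated by J trials, J = 1..max_lag.
    lagged = [u * (1.0 - g) * (1.0 - c) ** j for j in range(1, max_lag + 1)]
    return {
        "c": c,
        "sd_total_errors": sd_errors,
        "mean_trial_of_last_error": last_error,
        "lagged_error_pairs": lagged,
    }


if __name__ == "__main__":
    for label, errors in (("correction", 2.34), ("noncorrection", 4.85)):
        stats = one_element_predictions(errors)
        print(label, {k: v for k, v in stats.items() if k != "lagged_error_pairs"})
```

Under these assumptions the observed means of 2.34 and 4.85 give c of about .214 and .103, and the derived values agree with the predicted entries of Table 14 to within rounding.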

For the equal items run under the cor- 
rection procedure the theoretical analysis 
is a bit more complicated. We might ex- 
pect that the association between the stim- 
ulus and the pair of payoffs for each of 
these items would be learned in essentially 
the same manner as the corresponding as- 
sociations for the unequal items. However, 
in the case of the equal items it is not 
clear from a priori considerations how a sub- 
ject’s behavior should change when the asso- 
ciation is learned, for neither the nature of 
the task nor specific instructions dictated 
any uniform mode of behavior for items on 
which the payoffs were equal for the two 
responses. In the absence of any other 
prescribed course of action, we might ex- 
pect, by analogy with results on effects of 
blank trials in probability learning (Estes, 
1964b) that in many cases subjects would 
settle on the simple strategy of making one 


FIG. 6. Mean learning curves for criterion items and one-element model predictions (ordinate: proportion of errors; abscissa: trial; curves for the correction and noncorrection groups).
TABLE 14
OBSERVED STATISTICS AND ONE-ELEMENT MODEL PREDICTIONS FOR UNEQUAL-POINT ITEMS

                                              Correction          Noncorrection
Statistic                                 Observed Predicted   Observed Predicted
c                                           .214                 .103
Mean total errors                           2.34      -          4.85      -
  SD                                        2.15     2.34        5.09     4.85
Mean trial of last error                    3.68     3.86        8.05     8.80
  SD                                        3.61     4.17        7.73     9.19
Mean number of alternations                 2.14     2.34        4.09     4.85
Mean successes between adjacent errors       .58      .65         .66      .81
  SD                                         .94     1.03        1.06     1.21
Mean autocorrelation between Kth and
(K + J)th trial
  J = 1                                     1.01      .92        2.15     2.18
  J = 2                                      .77      .72        1.96     1.95
  J = 3                                      .60      .57        1.80     1.75
  J = 4                                      .45      .45        1.61     1.57
  J = 5                                      .33      .35        1.54     1.41
  J = 6                                      .23      .28        1.35     1.26
Mean errors before the Kth success
  K = 1                                      .96      .82        1.40      .91
  K = 2                                     1.42     1.86        2.14     1.64
  K = 3                                     1.78     1.70        2.80     2.24
Mean number of J-tuples of errors
  J = 1                                     2.34     2.84        4.85     4.85
  J = 2                                     1.01      .92        2.53     2.18
  J = 3                                      .47      .36        1.56      .98
  J = 4                                      .21      .14        1.07      .44
  J = 5                                      .07      .06         .77      .20
  J = 6                                      .02      .02         .57      .09
Mean number of runs of errors of length J
  J = 1                                      .79      .86        1.34     1.48
  J = 2                                      .28      .34         .49      .66
  J = 3                                      .12      .13         .19      .30
  J = 4                                      .09      .05         .11      .13
Mean number of runs of errors               1.33     1.42        2.32     2.68

Note. Only items which met a criterion of five consecutive correct responses were included.


or the other of the two responses uniformly 
to each of the equal items once the pay- 
offs had been learned. The protocols for 
the equal conditions suggest that such was 
the case; if we denote the more frequent 
response in each protocol as a “success” 
and the less frequent as an “error,” the 
frequency of errors, so defined, decreases 
over trials in much the same manner as do 
errors in the case of the unequal items. In 
fact, with successes and errors defined in 
this way, application of the one-element 


model to the pooled data for the equal items
yields a fit nearly as good as that for the
unequal items and with a virtually identical
value of the conditioning parameter c.
Further, as indicated in the previous section,
the characteristics of precriterion responding
(the criterion now being five consecutive
“successes”) are in very good
agreement with predictions from the one-element
model. The only substantial deviation
from the model occurs in the case of
postcriterion responding. For the equal
items the proportion of “successes” on
postcriterion trials is only .88 as compared
to a corresponding figure of .98 for unequal
items.

All of the available evidence is, then, 
in agreement with the notion that learning
of stimulus-outcome associations occurred
for the equal items in the same
fashion as for the unequal items, that 
learning occurred on an all-or-none basis 
in accordance with the assumptions of the 
one-element learning model, but that post- 
criterion responding was not so uniform 
as in the case of the unequal items. 

There is no reason to expect the course 
of learning to have entirely the same prop- 
erties in the case of a noncorrection pro- 
cedure, since only incomplete information 
is available to the subject on each trial 
and, in general, associations between the 
stimulus and each of the outcomes asso- 
ciated with a given item must be learned 
(on different trials) before the learning of 
a given item is complete. Nonetheless, for 
purposes of comparison the one-element 
analysis was also applied to the unequal 
items run with the noncorrection pro- 
cedure, the results being included in Figure 
6 and Table 14. Even here, the fit is not 
bad by ordinary standards, but it can be 
seen that there are a number of deviations
between predicted and observed values 
considerably larger than any occurring for 
the correction condition. In particular, the 
statistics of Table 14 reflect the tendency 
already noted for excessively long runs of 
responses to occur under the noncorrection 
condition; for example, mean numbers of 
errors before the first, second, and third 
successes are considerably too large, and 
the mean number of alternations of success 
and failure is substantially smaller than 
predicted by the model. As might have 
been anticipated on the basis of the consid- 
erations advanced in an earlier section, the 
tendency for excessively long runs under 
the noncorrection procedure applies pri- 
marily to errors (arising, presumably, from 
an excessive tendency to make repeatedly a 
response yielding an unfavorable though 
relatively high payoff before both payoffs 
associated with a given item have been 


learned). On the other hand, runs of suc- 
cesses during precriterion responding are 
shorter than predicted. The pattern of discrepancies
suggests the desirability of seeking
to formulate a model for the noncorrection
condition which will have properties
similar to the one-element model except 
that allowance should be made for the 
learning of two independent elements of 
information in relation to each individual 
item. 

A number of models which embody these 
ideas in slightly different ways have been 
formulated and examined in some detail 
by two of the authors (MC and WKE). 
The one which appears most promising 
upon initial application to the present data 
can be summarized in terms of the follow- 
ing transition matrix: 


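$$
T = \begin{pmatrix}
1 & 0 & 0 & 0 \\
c\alpha & 1 - c\alpha & 0 & 0 \\
c\beta & 0 & 1 - c\beta & 0 \\
0 & \dfrac{c}{2} & \dfrac{c}{2} & 1 - c
\end{pmatrix}
$$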


The rows of the matrix from bottom to 
top (and the columns from right to left) 
correspond to the four states of learning 
among which transitions are assumed to 
occur during the course of an experiment
with the noncorrection procedure. The
first state is the one obtaining at the beginning
of the experiment, in which each of
the two responses to each item is assumed
to have probability 1/2. On each trial when
the subject is in this state there is probability
1 - c of remaining and probability c that
learning occurs, taking the system into either
the second or the third state. In the
second state the subject has learned the
stimulus-response-outcome association involving
the lower of the two payoffs and in the third
state he has learned only the
stimulus-response-outcome association involving the
higher of the two payoffs. Thus from the
initial state the subject can go into the second
state only on a trial on which the lower
payoff is received and can go into the third
state only on a trial on which the higher
payoff is received.
The response rules associated with the
two intermediate states are somewhat more 
complex than any that occur in conjunc- 
tion with experiments run under a correction 
procedure. For, if the subject has learned 
that a payoff of, say, 6 points is associated 
with one of the two possible responses to a 
given stimulus, it is not clear whether the 
best strategy is for him to make that re- 
sponse on each subsequent recurrence of 
the stimulus in order to obtain at least 
the (better than average) payoff of 6 
points per trial or to try the other response 
in order to make sure that it does not
carry a still higher payoff of 8 points. It
is reasonable to expect that when the subject
has learned only that one of the responses to
an item carries the lowest possible payoff,
one unit, he will always try the other re- 
sponse at the next opportunity; that when 
he has learned only that one of the alterna- 
tives carries the highest possible payoff, 
8 units, he will make that response uni- 
formly on subsequent occurrences of the 
stimulus; and that for intermediate states 
involving other payoffs his tendencies to 
stay with the known payoff and explore 
the unknown one will be related to the 
value of the known payoff. 

To represent these notions formally we
associate with each of the two intermediate
states a parameter representing the approach
or avoidance tendency associated
with the payoff which has been learned.
For State 2 this parameter is β, and it may
be interpreted as the probability that S,
having learned the stimulus-response-outcome
association involving the smaller of
the two payoffs for an item, will depart
from that response and make the alternative
on any trial so long as he remains in
the state. Similarly α represents the probability
that a subject in State 3, having
learned the association involving the higher
payoff, will nevertheless depart from that response
and explore the alternative. The
fourth state is the terminal one in which
the subject has learned both of the payoffs
associated with the given item and subsequently
makes always the response carrying
the higher payoff value. Clearly, a
transition from State 2 or State 3 to State
4 can occur only on a trial on which a subject
makes the response associated with the as 
yet unlearned payoff. Thus transitions 
from State 2 to State 4 occur only on 
“success” trials and transitions from State 
3 to State 4 only on “error” trials. About
the parameters α and β we assume the
following: when the higher payoff is the
largest possible, 8 points in the present
experiment, a response known to yield
that outcome is always made and the corresponding
α value is zero. When the outcome
known to be associated with the
particular response is the lowest possible,
in the present experiment 1 point, that
response is always rejected, and thus the
value of β associated with the corresponding
state is unity. For intermediate payoff
values it is assumed that the values of
α and β are roughly proportional to the
true probabilities that failure to accept a
known payoff of a given value will yield
a larger payoff on the alternative response.
Since, in the present experiment,
subjects would have had to learn these probabilities
during the course of the experiment,
it is not feasible to attempt to predict
in advance the values of α and β that
would obtain for each of the intermediate
states on each trial. For purposes of preliminary
application of the model to our
data what we have done is to estimate 
values for the parameters associated with 
the three intermediate payoffs and then 
use these to predict statistics for all of the 
payoff combinations. 
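
The four-state model can be explored numerically by stepping the state probabilities through the transition matrix. The sketch below is illustrative only: it assumes the matrix rows are ordered State 4, State 3, State 2, State 1, that the error probabilities in the four states are 0, α, 1 - β, and 1/2 as the response rules above imply, and it compares the result with a closed form derived here from those assumptions (not quoted from the text). All function names are hypothetical.

```python
def transition_matrix(c, alpha, beta):
    """Rows and columns ordered State 4, State 3, State 2, State 1."""
    return [
        [1.0, 0.0, 0.0, 0.0],                     # State 4: both payoffs learned
        [c * alpha, 1.0 - c * alpha, 0.0, 0.0],   # State 3: higher payoff learned
        [c * beta, 0.0, 1.0 - c * beta, 0.0],     # State 2: lower payoff learned
        [0.0, c / 2.0, c / 2.0, 1.0 - c],         # State 1: nothing learned
    ]


def step(p, T):
    """One trial: multiply the row vector p by the matrix T."""
    return [sum(p[i] * T[i][j] for i in range(4)) for j in range(4)]


def error_probabilities(c, alpha, beta, n_trials):
    """Probability of an error on trials 1..n_trials, starting in State 1."""
    T = transition_matrix(c, alpha, beta)
    p = [0.0, 0.0, 0.0, 1.0]
    per_state_error = [0.0, alpha, 1.0 - beta, 0.5]
    out = []
    for _ in range(n_trials):
        out.append(sum(pi * ei for pi, ei in zip(p, per_state_error)))
        p = step(p, T)
    return out


def closed_form(c, alpha, beta, n):
    """Closed-form error probability on trial n (requires alpha < 1)."""
    return (0.5 * (1.0 - c * beta) ** (n - 1)
            + alpha / (2.0 * (1.0 - alpha))
            * ((1.0 - c * alpha) ** (n - 1) - (1.0 - c) ** (n - 1)))


if __name__ == "__main__":
    probs = error_probabilities(0.11, 0.2, 0.9, 5)
    for n, p_err in enumerate(probs, start=1):
        print(n, round(p_err, 4), round(closed_form(0.11, 0.2, 0.9, n), 4))
```

The parameter values in the usage lines are arbitrary; the point of the comparison is only that iterating the matrix and evaluating the closed form give the same error probability on each trial.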

For purposes of this report we limit our- 
selves to an evaluation of the adequacy of 
the model just outlined for the purpose of 
predicting the ordering of the various payoff
combinations with respect to total errors
over the 40-trial sequence. By standard
methods (see, for example, Atkinson &
Estes, 1963) the nth power of the transition
matrix may be used to generate a function
for the probability of an error on any trial,
which proves to be

$$
P(e_n) = \tfrac{1}{2}(1 - c\beta)^{n-1} + \frac{\alpha}{2(1-\alpha)}\left[(1 - c\alpha)^{n-1} - (1 - c)^{n-1}\right].
$$

Summing this last expression over a block
of N trials yields, finally, the following
expression for mean total errors:

$$
E(T_N) = \frac{1}{2c}\left[\frac{1+\beta}{\beta} - \frac{(1 - c\beta)^{N}}{\beta} - \frac{(1 - c\alpha)^{N}}{1-\alpha} + \frac{\alpha(1 - c)^{N}}{1-\alpha}\right].
$$

For the present experiment we take N 
equal to 40. The value of the learning 
parameter c can be estimated from total 
errors in the 8-1 condition, and when this 
is done the value obtained is .11. However, 
we can seek to predict the value of the 
conditioning parameter from the data of 
the correction group. We note that in 
terms of the general theory we have out- 
lined the principal difference between the 
conditions obtaining for the correction 
group on any unequal item and the non- 
correction group for an item with an 8-1 
payoff combination is that for a correction 
subject it is possible to learn both of the 
payoff values associated with a stimulus 
on a single trial, whereas for a noncor- 
rection subject this learning requires at 
least two trials. If we represent by c' the 
probability that an association between the 
stimulus and any one displayed payoff is 
learned on a single trial, then for a subject 
who begins a trial in the unlearned state un- 
der the correction procedure, the probabil- 
ity that at least one of the associations will 
be formed on a given trial is, on the assumption
that the conditioning processes
are independent, given by c' + c' - c'².
For a corresponding subject in the noncorrection
condition, the probability that an
association between the stimulus and the
outcome that occurs would form on any
one trial should be equal simply to c'.
Thus, if we take the value of the conditioning
parameter c of the one-element
model estimated for the correction
group, and set this equal to the quantity
2c' - c'², and solve for c', we should obtain
a prediction of the value of the conditioning
parameter for the noncorrection
group. Going through the necessary calculations
with our data we arrive at a
value of .1135 as the predicted condition- 


ing parameter for the noncorrection group, 
in rather pleasing agreement with the 
value of .1124 estimated directly from the 
data of the noncorrection group. The 
theoretical value of total errors for the 8-1 
condition shown in Table 15 is based on 
this calculation. 
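
The calculation just described amounts to solving a quadratic. As a brief illustrative sketch (the function name is hypothetical), setting c = 2c' - c'² is the same as 1 - c = (1 - c')², so the admissible root is c' = 1 - sqrt(1 - c):

```python
import math


def single_association_rate(c_correction):
    """Solve c = 2c' - c'**2 for c', taking the root between 0 and 1."""
    return 1.0 - math.sqrt(1.0 - c_correction)


if __name__ == "__main__":
    c_prime = single_association_rate(0.214)  # c for the correction group
    print(round(c_prime, 4))
    # Consistency check: substituting back should recover c.
    print(round(2 * c_prime - c_prime ** 2, 4))
```

With c = .214 this gives a value close to the .1135 reported in the text; the small difference in the fourth decimal presumably reflects rounding of c.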

The same value of the conditioning pa- 
rameter was assumed to hold for all of the 
payoff combinations, and the
parameters α and β associated with the
intermediate payoff values of 2, 4, and 6
were estimated simply by a rough scanning
procedure in which values of total errors
were computed for various pairs of α and β
values running in each case in steps of .1
from .1 to .9. The estimates of the shift
parameter (α when the given payoff is the
larger and β when it is the smaller) selected
by means of this scanning procedure
were .9 for the 2-point payoff, .8 for the 
4-point payoff, and .2 for the 6-point pay- 
off. As might have been anticipated on 
the basis of previous studies of behavior 
in betting situations and the like, our sub- 
jects tend to be somewhat more ready to 
accept an above-average payoff and some- 
what more ready to reject a below-average 
payoff than objective probabilities would 
dictate. In any event, using these estimates 
we computed the theoretical values of total 
errors for all payoff combinations, which are 
shown in Table 15 together with the ob- 
served values. The correspondence between 


TABLE 15
OBSERVED AND THEORETICAL TOTAL ERRORS (ALL
TRIALS) FOR UNEQUAL ITEMS ON
NONCORRECTION PROCEDURE

             Total errors             Rank
Payoff   Observed Theoretical  Observed Theoretical
 8-1       4.41      4.41ᵃ         9       10
 8-2       4.25      4.87         10        9
 8-4       5.64      5.43          7        8
 8-6      11.40     13.28          2        1
 2-1      11.19      8.57          3        3
 4-1       8.60      8.50          5        4
 6-1       6.19      6.63          6        7
 4-2      12.56      8.96          1        2
 6-2       5.52      7.09          8        6
 6-4       9.60      7.65          4        5

ᵃ Predicted via c value estimated from correction
data.
predicted and observed ranks appears reasonably
promising.
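
The theoretical total errors can be recomputed from the parameter estimates given above. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses the total-errors expression for N = 40 in the form derived from the four-state model, takes c = .1124 (the direct noncorrection estimate) for every payoff combination, and uses the shift estimates from the text (.9, .8, and .2 for payoffs of 2, 4, and 6, with 1 for the 1-point payoff and 0 for the 8-point payoff). The names `SHIFT`, `mean_total_errors`, and `predicted_errors` are hypothetical.

```python
# Shift parameter attached to each known payoff: probability of leaving
# the response that is known to yield that payoff.
SHIFT = {1: 1.0, 2: 0.9, 4: 0.8, 6: 0.2, 8: 0.0}


def mean_total_errors(c, alpha, beta, n_trials=40):
    """Expected total errors over n_trials (requires alpha < 1, beta > 0)."""
    return (1.0 / (2.0 * c)) * (
        (1.0 + beta) / beta
        - (1.0 - c * beta) ** n_trials / beta
        - (1.0 - c * alpha) ** n_trials / (1.0 - alpha)
        + alpha * (1.0 - c) ** n_trials / (1.0 - alpha)
    )


def predicted_errors(high, low, c=0.1124, n_trials=40):
    """alpha belongs to the higher payoff, beta to the lower payoff."""
    return mean_total_errors(c, SHIFT[high], SHIFT[low], n_trials)


if __name__ == "__main__":
    pairs = [(8, 1), (8, 2), (8, 4), (8, 6), (2, 1),
             (4, 1), (6, 1), (4, 2), (6, 2), (6, 4)]
    for high, low in pairs:
        print(f"{high}-{low}: {predicted_errors(high, low):.2f}")
```

Under these assumptions the computed values agree with the theoretical column of Table 15 to within rounding, which suggests that the tabled predictions rest on a c value near .11 together with the three shift estimates.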

For the equal items run with the non- 
correction procedure, there again arises 
the difficulty that the experimental instructions
prescribed no uniform course of
action for the subject once he has learned 
both of the payoffs associated with the 
given stimulus. On the whole, properties 
of the protocols appear on inspection to 
be roughly similar for the equal and un- 
equal items, as was the case for the cor- 
rection procedure. With the exception of 
the 8-8 payoff combination, the picture for 
the equal items deviates from that for 
the unequal in that the proportion of suc- 
cesses following the first criterion sequence 
is lower for the equal items. For the 8-8
condition, we might expect learning to be 
very similar for equal and unequal items, 
since in the equal case once the subject has 
learned one of the payoffs associated with a 
given stimulus, he should be expected to 
make that response uniformly on subse- 
quent trials. To check on this prediction, we 
can set the parameters α and β in the transition
matrix equal to 0 and 1, respectively, so
that the matrix in effect collapses to the
transition matrix of the simple one-element
model. Proceeding, then, to estimate the 
conditioning parameter from observed total 
"errors" of 5.04 for the 8-8 noncorrection 
condition, we obtain an estimate of .10 
for c which is not far from the value of .11 
obtained for the unequal items. 
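
The estimate of c for the 8-8 condition can be reproduced with a simple numerical search. The sketch below is illustrative and rests on one assumption: with the shift parameters set to 0 and 1, expected total errors over N trials reduce to [1 - (1 - c)^N] / (2c), a form derived here from the model rather than quoted from the text. The function names are hypothetical.

```python
def truncated_total_errors(c, n_trials=40):
    """Expected total errors over n_trials for the one-element model, g = 1/2."""
    return (1.0 - (1.0 - c) ** n_trials) / (2.0 * c)


def estimate_c(observed_errors, n_trials=40, tol=1e-10):
    """Bisection: expected errors decrease as c increases."""
    lo, hi = 1e-6, 1.0 - 1e-6
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if truncated_total_errors(mid, n_trials) > observed_errors:
            lo = mid   # too many predicted errors, so c must be larger
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(round(estimate_c(5.04), 2))  # 8-8 noncorrection condition
    print(round(estimate_c(4.41), 4))  # 8-1 noncorrection condition
```

On this reading, 5.04 total errors correspond to c of roughly .10 and 4.41 total errors to c of roughly .11, in line with the values reported in the text.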


SUMMARY AND CONCLUSIONS


The learning task assigned subjects in 
this study may be regarded either as a set
of concurrent trial-and-error learning prob- 
lems or as a modified paired-associate list 
with 25 different stimuli and 2 alternative 
responses. The conditions of reinforcement 
differed from those of usual paired-asso- 
ciate learning only in that the 2 alterna- 
tive responses to each stimulus carried 
different reward values rather than being 
simply categorized as correct or incorrect.
Numerous quantitative analyses of the 
learning protocols obtained under a cor- 
rection procedure, in which full informa- 
tion concerning the rewards associated 
with both alternative responses to a stimulus
was given at the end of each trial,
indicated that the learning process could
be approximated quite closely by the same 
one-element associative-learning model 
that has previously been found by Bower 
(1961, 1962) to describe learning of simple 
paired associates or verbal discriminations 
with two response alternatives. In data 
obtained under a noncorrection procedure, 
for which only the reward associated with 
the response actually made by the subject 
was shown on each trial, our analyses in- 
dicated that learning might be represented 
as a resultant of two independent one- 
element processes with respect to each 
stimulus. With rate of learning defined in 
terms of speed of error elimination, it was 
found that the magnitudes of the rewards 
associated with the alternative responses 
to a given stimulus had virtually no effect
in the full information condition but
a large and significant effect in the partial
information condition. 

The theoretical question of primary con- 
cern was that of whether the rewards in 
this situation exert any direct effect on the 
process of associative learning. Although 
it is always difficult to demonstrate the 
absence of an effect, some of our results 
point rather strongly in that direction. For 
example, in the full-information condition, 
the correct response to an item for which 
the higher and lower rewards were in a 2:1 
ratio was learned just as rapidly as the 
correct response for an item having high 
and low rewards in an 8:1 ratio, even 
though in the latter case the correct re- 
sponse received a much larger reward. The 
fact that errors were eliminated at different 
rates for the same two items in the partial 
information condition can be interpreted
purely in terms of performance factors.
Once a subject has learned that a response 
to a particular stimulus leads to a reward 
of 8 units, the largest possible in the situa- 
tion, he has no reason to deviate from this 
response to the given stimulus on later 
trials. However, when a subject has learned 
that a response to a certain stimulus carries 


a reward value of 2, there is considerable
motivation to explore the other response
on subsequent trials since it might carry a
larger reward value. A model embodying 
these notions concerning the dependence of 
performance upon the state of learning rel- 
ative to the possible rewards for any given 
item generated predictions accounting for a
large part of the variance of total errors 
per item in the partial information group. 
Evidence from response latencies con- 
forms in all major respects to the conclu- 
sions drawn from the analyses of correct 
and incorrect responses. Although the over- 
all curves for latencies versus trials show 
decreasing trends, detailed analyses of 
pre- and postcriterion sequences for individual
protocols show that changes in latency
during learning can be interpreted in
terms of all-or-none transitions between 
two discrete states. When an item is in the 
unlearned state—that is, the correct asso- 
ciation has not yet been formed—latencies 
come from a distribution with a relatively 
high mean, which is, however, equal for 
correct and incorrect responses. Once the 
correct association has been formed, the 
latencies for the item come from a distri- 
bution with a lower mean. Effects of re- 
ward magnitude upon response latencies 
are essentially parallel to those upon re- 


sponse frequencies. In the full-information 
condition, there are virtually no differences 
in mean latencies for items having different 
associated reward values. For the partial 
information condition, latencies are smaller 
for items carrying larger reward values. 
Our interpretation of the functions ob- 
tained for the partial information condi- 
tion is that when an item is in the inter- 
mediate state of learning (that is, only one 
of the two relevant associations has been 
formed), if the reward value that has been 
learned is relatively high, a subject readily 
accepts the corresponding response and per- 
forms it overtly with a short latency; if 
the reward value learned is relatively low, 
a subject tends to reject it and instead try the 
alternative response, a process which en- 
tails a longer latency. 

Taking the frequency and latency an- 
alyses together, the full pattern of evidence 
appears to support the conclusion that 
magnitude of reward in this situation is 
solely a determiner of performance: that 
is, the process of associative learning is 
independent of reward values, but after 
such learning has occurred, response selec- 
tion is a function of anticipated rewards. 
STUDIES OF CODING IN VERBAL LEARNING


BENTON J. UNDERWOOD AND ADRIENNE H. ERLEBACHER
Northwestern University¹


6 experiments are reported in which free learning (FL) and paired-associate 
learning (PAL) were examined with respect to the effects of coding of verbal 
units on learning. In 2 FL experiments and 1 PAL experiment where re- 
sponse terms were manipulated, encoding of trigrams to words produced a 
more meaningful unit. Such encoding was shown to influence learning posi- 
tively only if decoding was simple. Encoding of a stimulus term to a word 
was also shown to influence learning positively, but such encoding did not 
occur unless the possibilities were easily perceived. Finally, an experiment 
demonstrated sound coding of response terms, but the positive effect on 
transfer was small and limited to unmixed lists. We concluded that coding 
systems: (a) may influence learning positively if decoding is simple; (b)
will produce only a small positive effect even under favorable conditions;
(c) may have no positive effect even if used and may, under certain conditions,
inhibit learning.


¹ This work was done under Contract Nonr-1228(15), Project 154-057, between
Northwestern University and the Office of Naval Research.


The present experiments deal with two
kinds of verbal-learning tasks as vehicles
for the study of coding, and as a preliminary
to presenting the background of
the experiments these two tasks must be
noted. One of the tasks is free learning
(FL). In FL the subject (S) is given a series
of verbal units to learn without regard
to order of the units. On successive trials
the units are presented in varying orders
and S may recall in any order he chooses.
The second task is paired-associate learning
(PAL). In this task S is presented a series
of pairs, and he must learn to produce
the second member of each pair when the
first is presented.

By way of leading to a discussion of coding,
we may note what is meant by no coding
for each of the two tasks. Without regard
to the neurology of the matter, we may
say that if in FL the verbal unit goes into
memory storage in a form that reflects exactly
the unit as presented, no coding has
occurred. Conceptually, this is to say that
the representation of the unit in memory is
isomorphic to the unit as presented for
learning. In PAL, if the response term is
stored isomorphically and if it is produced
by a direct association with the stimulus 
term as presented, no coding has occurred. 
In a manner of speaking, S has acquired a 
“raw” association between the stimulus and 
response terms. It becomes apparent that 
positively speaking we mean by coding the 
changes, transformations, additions, sub- 
tractions, adumbrations, and so on which 
occur to and between verbal units as pre- 
sented and which are, we assume, reflected 
in what is stored as memory. 

If, following FL or PAL, Ss are interro- 
gated about the manner in which they 
learned the units or associations between 
units, an appreciable number will report 
forms of coding. As a group, these forms of 
coding are often spoken of as associational 
aids. However, it is apparent that such re- 
ports do not in themselves allow us to con- 
clude that coding has aided learning. Even 
if those Ss who report heavy use of coding 
systems learn more rapidly than those who 
report infrequent use of such systems, we 
cannot conclude that coding was responsi- 
ble for the more rapid learning. Indeed, it is 
quite possible that certain coding systems 
may inhibit learning. An understanding of 
the effects of various coding systems on 
learning can only occur when experimental 
control over the systems is in some way ex- 
ercised. Such efforts have been undertaken 
in recent years, the great amount of work
on mediated associations (e.g., Jenkins,
1963a) being most widely known. The pres- 
ent report represents another attack on these 
problems. 

We wish to emphasize a point of logic
which pertains to all of the experiments to
be reported. We, as experimenters (Es),
have devised certain materials which we believed
were of such nature as to induce the
use of a particular coding system in learning.
Within these materials we have devised
further differences which we believed would
change the learning rates if the coding system
were used. If these attempts fail, that is, if no
differences in learning among conditions
emerge, we cannot conclude that S is not
coding. We cannot conclude this for two
reasons. First, S may use a system of his own
choosing which is not the one under experimental
control, and the chosen system may
not be related to the experimental variable
(the coding system we inserted). Under such
circumstances we “draw a blank.” Second,
S may be coding in a manner which corresponds
to the system devised by E, but
which does not influence the rate of learning.
Such an occurrence may sometimes be detected
by certain forms of internal evidence
(e.g., errors).

As noted earlier, we will use both FL and 
PAL tasks. The general rationale of the ex- 
periments for each type may now be indi- 
cated. 

Free Learning. The initial experiments 
were a direct outgrowth of previous work 
(Underwood & Keppel, 1963). In this pre- 
vious study, Ss were presented trigrams for 
FL. The letters of each trigram, if re- 
arranged in either of two ways, produced a 
common three-letter word. The trigram USB 
would produce either SUB or BUS. Obviously,
we attempted to utilize the law that the 
higher the meaningfulness the easier the 
learning; if S solved the anagram a more 
meaningful unit was available. If the word 
became the unit of memory storage, ac- 
quisition of the letters without regard to 
order should occur more rapidly than if the 
trigram as presented became the unit of 
storage. However, it does not follow that 
measured learning rate of the order of the 
letters of the trigrams as presented should 


be facilitated by this coding. If the proce- 
dure requires S to recall the letters of the 
trigram in the order presented, as it did in 
this study, and if S encodes the trigram to 
a word, he must also learn certain rules in 
order to “get back” the original trigram; he 
must learn a decoding rule for each trigram.
Thus, this particular coding system ap- 
pears to have both facilitating and inhibit- 
ing characteristics. Overall, in the study, 
learning appeared to be inhibited somewhat 
by use of this coding system, and we pre- 
sumed that this was due to the decoding 
problems the task presented. Therefore, in 
the first experiment to be reported, we have
employed the same basic technique as in
the earlier study but have used lists of
single-solution anagrams as trigrams and
have varied the ease of decoding. Ease of
decoding was varied by varying the number
of decoding rules. By a decoding rule we
mean the rule which prescribes how the let- 
ters of the word (encoded trigram) are to 
be rearranged to produce the trigram. Vary- 
ing the number of decoding rules means 
variation in the number of different rules 
needed to handle all trigrams in the list. 
Clearly, the expectation is that the greater 
the number of rules required, the slower the 
learning. This is to be expected because the 
greater the number of rules the greater the 
number of subsets of items which must be 
discriminated and associated with particular
rules.
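
The notion of a decoding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a rule is written as a letter order such as 2-1-3 (the positions of the word's letters as they appear in the trigram), each word is assumed to contain no repeated letters, and the pairs ATC/CAT and ODG/DOG are invented examples rather than items from the experimental lists.

```python
from typing import Iterable, Tuple


def letter_order(trigram: str, word: str) -> str:
    """Describe the trigram's letter order relative to the word, e.g. '2-1-3'.

    Assumes the word has no repeated letters, so each position is unambiguous.
    """
    return "-".join(str(word.index(ch) + 1) for ch in trigram)


def apply_order(word: str, order: str) -> str:
    """Decode: rearrange the letters of the word back into the trigram."""
    return "".join(word[int(pos) - 1] for pos in order.split("-"))


def count_rules(pairs: Iterable[Tuple[str, str]]) -> int:
    """Number of distinct decoding rules needed for (trigram, word) pairs."""
    return len({letter_order(trigram, word) for trigram, word in pairs})


# USB/SUB comes from the text; the other two pairs are hypothetical.
example_list = [("USB", "SUB"), ("ATC", "CAT"), ("ODG", "DOG")]
print(letter_order("USB", "SUB"))
print(apply_order("SUB", "2-1-3"))
print(count_rules(example_list))
```

On this reading, a one-rule list is one for which `count_rules` returns 1, and the expectation stated above is that learning slows as that count rises.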

If, in the experiment outlined above, S
codes according to the prescribed system, 
it can be seen that learning that a particu- 
lar decoding rule goes with a particular 
group of trigrams within the list may be 
considered a form of concept learning. In 
FL, if certain subgroups of words belong to 
the same conceptual class, learning is gen- 
erally facilitated as compared with the list 
in which items do not conceptually fit within 
subgroups (Dallett, 1964). Therefore, in a 
further study of coding in FL, we have 
sought to facilitate the pairing of a decod- 
ing rule with a particular group of words by 
making the particular group conceptually 
distinct. 


Paired-Associate Learning. The FL task
does not allow us to specify the effects of
coding on acquiring particular associations
between a given stimulus and
response term. We have, therefore, performed
four experiments using PAL. However,
as will be seen, while PAL has certain
advantages over FL for studying coding
processes, it also produces a more complicated
picture analytically.

It was noted earlier that one of the prin- 
ciples we have used in devising materials 
for studying coding was to utilize the laws 
of meaningfulness. We presumed that if a 
coding system made an item more meaning- 
ful, the system might be utilized. In PAL, 
meaningfulness differences among response 
terms have enormous effects on learning 
rate, whereas the same manipulations among 
stimulus terms have relatively little effect 
(Underwood & Schulz, 1960). This implies 
that meaningfulness differences are very im- 
portant in acquiring units as units, but are 
of relatively little importance in the devel- 
opment of associations between units. We 
are thus led to the conclusion that when a
coding system is designed to make a unit
more meaningful in PAL, its effectiveness 
should be much greater if applied to the re- 
sponse terms than if applied to the stimulus 
terms. In any event, this is the assumption 
on which we have proceeded. Again, we have 
used trigrams which may be transformed 
into words. Let A-B stand for the association
to be learned, and X for the transformed
unit. If B is transformed or coded
(Xb), a single-step mediational chain is involved
in learning a pair, the two associations
being from A to Xb and from Xb to B.
Since the B response required by E is, as in
FL, a trigram, a portion of the learning task
lies in decoding; S must get from Xb back to
B. Coding should greatly facilitate the ac- 
quisition of the letters within each response 
(response learning), but again, if decoding 
is difficult, the coding system may inhibit 
overall learning. Therefore, as in FL, we 
varied the ease of decoding by varying the 
number of decoding rules. Indeed, since FL
is assumed to be a relatively pure form of
response learning and since we assume that
the major effect of the coding system in PAL
will be on response learning, we may well
expect that FL and PAL will produce the
same results when ease of decoding is varied.

As we have noted, it did not seem profitable
at this stage to devise a coding system
for the stimulus terms in PAL when the encoded
units of the system resulted in a
change in meaningfulness. However, there 
is another auxiliary variable to which we 
may appeal in studying stimulus-term cod- 
ing, namely, formal similarity. Rate of PAL 
is inversely related to formal similarity 
among stimulus terms (e.g., Levitt & Goss,
1961). If a coding system (involving encod- 
ing to words) could be inserted into the list 
such that its utilization would reduce the 
deleterious effects of interstimulus similar- 
ity, we might expect S to use the system. We 
would expect this since previous experi- 
ments (McGehee, 1962) have shown that 
high formal similarity among common 
words has little effect on learning. The sys- 
tem ought to be particularly effective since 
there is no problem of decoding—S never 
has to reproduce the stimulus term. There- 
fore, experiments were performed in which 
five-letter units were used as stimulus terms. 
In some conditions there was high letter duplication
among stimuli, in others, low duplication.
In conditions under both high and
low similarity, the letters were so arranged 
that a common five-letter word could be 
easily produced (the anagrams were easy); 
in others, solution was possible but difficult, 
and in still others, the anagram was in- 
solvable. The expectation is that the most 
difficult task will be the list with high 
stimulus similarity in which no coding (ac- 
cording to our system) is possible, the easi- 
est that condition in which there is low 
stimulus similarity and easy coding poten- 
tial. 

Unfortunately, the above situation allows
operation of another factor, and it should
operate differentially across conditions.
Given a list of stimuli of low meaningfulness,
having low interitem similarity, S may
develop an association between a single
letter of the stimulus and the response term
(e.g., Jenkins, 1963b). While stimulus selection
in this sense may itself be classed
as a form of coding, it was not our intention
to study it here. Therefore, to minimize the
possibilities of stimulus selection, we varied
the order of the letters within each stimulus
term from trial to trial. Our thinking was
that this would make it difficult for S to
“cull out” or select one or two letters as the
functional stimulus when interstimulus
similarity was low. So far as we know, this
technique of producing an unstable stimulus
from trial to trial (a stimulus in which the
elements remain constant but their order
varies from trial to trial) has never been
used in PAL. The above experiment generated
some need to explore the technique further,
and so in an additional experiment we
varied interstimulus similarity and the number
of different orders in which the five letters
were presented from trial to trial.
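
The idea of an unstable stimulus, whose letters stay constant while their order changes across trials, might be sketched as follows. This is a minimal illustration and not the procedure actually used: the function name, the seed, the cycling scheme, and the stimulus BRAIN are all assumptions made for the example, since the text does not say how the orders were chosen or sequenced.

```python
import itertools
import random
from typing import List


def presentation_orders(stimulus: str, n_orders: int, n_trials: int,
                        seed: int = 0) -> List[str]:
    """Pick n_orders distinct letter orders and cycle through them over trials.

    The random choice of orders and the simple cycling are assumptions.
    """
    rng = random.Random(seed)
    # All distinct arrangements of the stimulus letters, in a stable order.
    perms = sorted({"".join(p) for p in itertools.permutations(stimulus)})
    chosen = rng.sample(perms, n_orders)
    return [chosen[trial % n_orders] for trial in range(n_trials)]


# A hypothetical five-letter stimulus shown in three different orders.
for trial, shown in enumerate(presentation_orders("BRAIN", 3, 6), start=1):
    print(trial, shown)
```

Setting `n_orders` to 1 gives the conventional stable stimulus, so the same function covers both ends of the manipulation described above.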

In the final experiment we turned to a
different coding system. In all of the above
experiments, the coding potential consisted
of a rearrangement of letters designed to
produce a more meaningful response term or
to reduce the deleterious effects of interstimulus
similarity. In this final experiment we
attempted to determine whether or not
sound coding occurs, and if it does, whether
or not it relates to rate of learning. The
pronunciability of three-letter units is
known to predict their learning rates with
considerable precision (Underwood &
Schulz, 1960). One may speculate that this
relationship is due to sound coding. By
sound coding in this case we mean that a
verbal unit, such as a trigram, can be “condensed”
into a syllabic
tively integrating the letters. Decoding
should be relatively easy since S has well-established
phonetic habits and the number
of different sets of three letters represented
by a given syllabic sound is limited.

It will be seen that the major experiments 
deal with situations in which S is left con- 
siderable freedom of choice in his learning 
(according 


or not to use the system. We have not at- 
tempted to force S to use a particular sys- 
tem by making it impossible for him to 
learn if he doesn’t use the system. The ex- 
periments were not planned for such a level 
of analysis; rather, they were planned as a 
means of studying coding in the relatively 
free situation that faces S in the usual ver- 


of the experiments to be reported. While our
interest is primarily in attempting to determine
if coding has occurred, and if so, what
its effects on learning are, we will also present
data on other problems if the data recommend
it. The presentation of the experiments
will follow the sequence in which they
have been discussed above.


EXPERIMENT 1


In this experiment, FL was used with
lists of 12 trigrams. If the letters of each
trigram were rearranged, a common three-letter
word was formed. The evidence from
the previous study (Underwood & Keppel,
1963) suggested that at least some Ss did
code in the expected manner but that their
performance may have been inhibited by
the coding. The reason for this inhibition
(as inferred from errors) appeared to lie in
the decoding process. No single decoding
system or rule applied to all words. Experiment
1, therefore, varied the number of different
decoding rules applicable to the items
in the list. The lists were so constructed
that in one condition a single decoding rule
applied to all 12 trigrams; in another, two
rules were needed, each applying to 6 of the
trigrams; and in a third condition, four rules
were required, each applying to a different
group of 3 trigrams. In learning, all Ss were
required to reproduce the trigrams at recall
exactly as presented.


Method 


Lists. All trigram lists were made up of variations
in the letter order of the following 12 words:


2-1-3, 2-3-1, and 3-1-2. Four different lists, one
corresponding to each of the four orders, were
used for the one-rule condition (Condition 1R).
Thus, the trigrams in one list all had the 1-3-2
order, and the presumed decoding rule is that of
moving the last letter to the second position.
Another list had


a particular rule since the four-rule list (Condition 
different trigram 
orders. Two lists were used for Condition 2R; in
one list half the items had the letter order 1-3-2,
half had 2-1-3; in the other list the orders were
3-1-2 and 2-3-1. For Condition 4R a single list was
used, three trigrams being assigned to each of the 
four trigram orders. There were, then, actually 
seven different lists; all represented variations in 


connection between letters of the trigrams in the 
lists as determined from the Underwood and Schulz 
(1960) letter-association tables were 13.03, 13.03, 
and 11.58, for one, two, and four rules, respectively.

Procedure. The lists were presented at a 3-second
rate for six study trials, each study trial being
followed by a 48-second written recall. Four dif-


to be counted correct. Letter order within the tri- 
gram was clearly distinguished from the order of 
the trigrams which could, of course, be written in 
any order. Finally, it was pointed out to S that if 
the letters were rearranged a common word would 
result, and the illustration EOJ was given.²
Design. A total of 32 Ss was assigned to each 
of the three types of lists (one, two, or four rules). 
Eight schedule sheets were constructed with 12 en- 
tries on each. Each rule condition was entered four 
times on each sheet: for Condition 1R each of the 
four different lists was represented once; for Con- 
dition 2R each of the two lists appeared twice, 
and for Condition 4R, the single list was entered 
four times. The ordering of the 
schedule sheet was random, and a different random
order was used for each of the eight sheets. The
Ss were assigned to the entries on the schedule
sheet in order of their appearance at the laboratory.
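
The assignment scheme just described can be sketched as a short Python routine. It is an illustration only: the function name, the seed, and the list labels ("list 1", and so on) are hypothetical, and the use of a pseudorandom shuffle stands in for whatever random procedure was actually used.

```python
import random
from collections import Counter
from typing import List, Tuple

Entry = Tuple[str, str]  # (condition, list label); labels are hypothetical


def build_schedule_sheets(n_sheets: int = 8, seed: int = 0) -> List[List[Entry]]:
    """Build schedule sheets with 12 entries each, 4 per rule condition."""
    rng = random.Random(seed)
    entries: List[Entry] = (
        [("1R", f"list {i}") for i in range(1, 5)]    # four 1R lists, once each
        + [("2R", f"list {i}") for i in (1, 2)] * 2   # two 2R lists, twice each
        + [("4R", "list 1")] * 4                      # single 4R list, four times
    )
    sheets = []
    for _ in range(n_sheets):
        sheet = entries[:]
        rng.shuffle(sheet)  # a different random order for every sheet
        sheets.append(sheet)
    return sheets


sheets = build_schedule_sheets()
totals = Counter(cond for sheet in sheets for cond, _ in sheet)
print(totals)
```

Eight sheets with four entries per condition give the 32 Ss per condition stated in the design; Ss would then be matched to successive entries in order of arrival.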


Results 


Correct Responses. It will be remembered
that during the six trials all Ss were required
to produce the trigrams as presented;
the three groups differed only in terms of
the number of different decoding rules applicable
to the 12 trigrams in the list. The
mean number of correct responses on each
of the six trials for the three groups is


² In this experiment, and in the following two
experiments, additional conditions followed original
learning. These conditions, involving changes in
response requirements (e.g., instructing Ss that


fore, we have chosen not to include these proce- 
dures and results. 


Fig. 1. Acquisition as a function of number of
decoding rules (Experiment 1).


shown in Figure 1. The Ss in Condition 1R
were superior to those in the other two
groups throughout the six trials; performance
under Condition 2R and Condition 4R
showed essentially no difference. In terms of
mean total correct responses over six trials,
the F is 13.45. With 2 and 93 df, an F of
4.85 is required for the .01 significance level.
We conclude that number of decoding rules
is a relevant variable in the learning of the
lists but that its effect is limited to a difference
between one rule and two or more
rules.

Overt Errors. If Ss were attempting to
learn by encoding to words and decoding
back to the trigram and if the major difficulty
in this process lay in the decoding, we
would expect S to recall the correct three
letters for a given trigram but to have difficulty
recalling them in the correct order,
that is, the order as presented. Furthermore, if the
differences in learning as shown in Figure 1
reflect differences in ease of decoding, the
Ss having Condition 1R should have fewer
such “good” errors than should Ss in Condition
2R and Condition 4R. Over the six
trials, Ss in Condition 1R averaged 1.50
such errors; 17 of the 32 Ss made none. The
Ss in Condition 2R averaged 5.94, those in
Condition 4R, 6.81; the difference between
these two conditions is not significant (F =
.33). If none of the Ss in any condition was
using the encoding-decoding scheme “put
into” the lists, we would expect no difference
in learning and no difference in good
errors, since the trigrams were comparable,
if not identical, across rule conditions.
Therefore, the data support the conclusion
that the Ss in Condition 1R were to some
extent using the relevant encoding-decoding
scheme.

Clustering. If the Ss in Conditions 2R and
4R were coding, we might expect some clustering
of the items in the recall protocols,
the clustering being based on rules. Thus,
in Condition 2R, one rule applied to six of
the trigrams, another to the remaining six.
If rules and trigrams were associated, clustering
according to rules might be anticipated.
The sixth-trial protocols were scored
for clustering by the formula r/(t - n). In this
formula, r represents the number of times
items of one rule are repeated without interruption
by items fitting another rule or by
an error. Thus, the sequence AABBBABB
would yield an r of 1 + 2 + 1 = 4. The term
t represents the total number of items recalled,
and n the number of different rules
represented in the items recalled. In the
above illustration, t is 8, and n is 2. Errors
were counted in determining t. As a second
step, each S's items (including errors) were
ordered randomly and scored for clustering
exactly as above. This randomly determined
clustering ratio was subtracted from the
“real” clustering ratio for each S and this
distribution of differences analyzed statistically
to determine if the mean differed
from zero. For neither Condition 2R nor Condition
4R was clustering significantly
different from that expected by chance. We
must conclude that if coding was occurring
in Conditions 2R and 4R according to the
“built-in” scheme, it was not reflected in
clustering.
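
The clustering score described above, r divided by (t - n), together with the random-order baseline, can be sketched in Python. This is a hedged illustration: the function names are hypothetical, errors are represented by `None` as an assumption of this sketch, and a single pseudorandom reordering stands in for the random ordering applied to each S's protocol.

```python
import random
from typing import List, Optional

Label = Optional[str]  # rule label of a recalled item; None marks an error


def repetitions(labels: List[Label]) -> int:
    """r: adjacent pairs of items fitting the same rule (errors break a run)."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a is not None and a == b)


def clustering_ratio(labels: List[Label]) -> float:
    """r / (t - n), with t counting errors and n counting only rules recalled."""
    t = len(labels)
    n = len({x for x in labels if x is not None})
    if t - n <= 0:
        return 0.0  # too few items for any repetition to be possible
    return repetitions(labels) / (t - n)


def chance_corrected(labels: List[Label], rng: random.Random) -> float:
    """Observed ratio minus the ratio for one random ordering of the same items."""
    shuffled = labels[:]
    rng.shuffle(shuffled)
    return clustering_ratio(labels) - clustering_ratio(shuffled)


protocol = list("AABBBABB")  # the illustration given in the text
print(repetitions(protocol), clustering_ratio(protocol))
print(chance_corrected(protocol, random.Random(0)))
```

For the sequence AABBBABB the sketch reproduces r = 4, t = 8, and n = 2. The per-S differences returned by `chance_corrected` are the values whose mean would then be tested against zero.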


Discussion 


that original learning with one decoding 
rule was more rapid than if two or four rules 
obtained among the trigrams. In considering
the source of this superiority, both the encoding
and the decoding processes must be
weighed. We assume that encoding to words
will hasten acquisition of the items as such
because of the higher meaningfulness of the
items resulting therefrom. This positive effect
from encoding, therefore, will show itself
in overall learning if decoding does not
produce a negative effect of equal magnitude.
One or both factors must be involved
in the observed differences in overall learning
between Condition 1R and Conditions
2R and 4R. It is possible that Ss in all conditions
may have been encoding to words,
with the differences in overall learning,
therefore, being attributable to differences
in the magnitude of the negative effect of
decoding. Or, the Ss in Conditions 2R and
4R may not have been encoding to words;
rather, they simply acquired the trigrams as
trigrams without regard to their codeability.
The analytical problem, therefore, is that of
determining whether or not Ss in Conditions
2R and 4R encoded to words. If it can be
shown that encoding did occur in these two
conditions, we would conclude that the superiority
of Condition 1R could be attributed
to a smaller negative effect in decoding.

It might be presumed that the question of
whether or not Ss in Conditions 2R and 4R
were encoding to words is answered by the
fact that there was no difference in learning
for these two conditions. Since decoding
ease should differ for the two conditions, it
might seem to follow that overall learning
in Condition 2R would have been more
rapid than in Condition 4R if Ss were encoding
to words. However, it is quite possible
that in neither condition did the Ss
successfully use or even acquire generalized
decoding rules. The decoding may have been
specific to each trigram in terms of “moving”
specific letters rather than a generalized
rule of moving letters in a particular
position for, say, six of the trigrams in
Condition 2R.

Our persistence in attempts to answer the 
question of whether or not Ss in Conditions 
2R and 4R were encoding to words is produced
by the nature of the overt errors
which occurred in these conditions. Most of
the errors were what we have called good
errors, which consisted of the right three
letters produced in the wrong order. In acquiring
a list of trigrams as trigrams (without
encoding to words), a considerable number
of integrative errors would be expected.
By an integrative error we mean one which 
consists of three letters which do not go to- 
gether, such as two letters from one trigram 
and a third from another. Although we did 
not report the frequency of such errors for 
the present conditions, it can be said that 
their number was small and did not differ 
for the three conditions. In short, the nature 
of the errors strongly suggested that the Ss 
in Conditions 2R and 4R were encoding to 
words. However, because the interpretation 
of the results, not only of Experiment 1, but 
also of Experiments 2 and 3, was critically 
involved in the issue of whether encoding to 
words occurred in Conditions 2R and 4R, we 
undertook a subsidiary experiment to ob- 
tain a definitive answer. 

A Subsidiary Experiment. The comparison
to be made is between acquisition of a
two-rule list in which the trigrams may be
encoded to words and the acquisition of a
list comparable in every respect except that 
the trigrams cannot be encoded to words. 
The former list will be called codeable, the 
latter, noncodeable. If Ss do encode to words 
in acquiring the codeable list, two findings 
must be obtained. First, the number of good 
errors must be greater for the codeable than 
for the noncodeable list. Second, the num- 
ber of integrative errors must be greater for 
the noncodeable list than for the codeable 
list. No prediction can be made concerning 
the rate of learning the two lists. However, 
if it is found that Ss given the codeable list 
do encode to words, differences in the rate of 
overall learning will tell us whether this en- 
coding facilitates or inhibits acquisition. 

One of the two-rule lists used in Experiment 1
constituted the codeable list. The
12 trigrams were: XSI, YDA, IGB, NME, ETG,
EWF, TLO, OBJ, XTA, ANC, WRO, INW. The noncodeable
list was generated from the codeable
items by replacing the third letter in
such a manner that the strength of the associative
connections between the second
and third letters was comparable for both
lists, the associative values being determined
from the Underwood and Schulz
(1960) tables. The following 12 trigrams resulted:
XSY, YDO, IGC, NMU, ЕТФ, EWK, TLA,
OBZ, XTI, ANK, WRI, INJ. Intralist similarity,
as determined by the number of repeated
letters, was essentially identical.

The two lists were presented for six FL 
trials under exactly the same conditions as
prevailed in Experiment 1. A total of 20 Ss
was assigned randomly to each list. The order
of the items on successive trials was exactly
the same for the two lists using the
common first two letters of the trigrams as
the basis for ordering. The instructions for 
both groups were the same except that the 
Ss given the codeable list were told that if 
the letters of each trigram were rearranged 
a word would result. These instructions con- 
formed to those given in Experiment 1. 

The mean number of total correct responses
over six trials was 40.75 for the
codeable list and 40.40 for the noncodeable
list (F = .01). There was no appreciable
difference between the two groups on any of
the six trials; performance on the codeable
list was slightly ahead on the first three
trials, slightly behind on the last three. On
the sixth trial the means were 8.85 and 9.05
for the codeable and noncodeable lists, respectively.
Clustering according to rule was
not significant in the codeable list (t = .46).

The nature of the overt errors differed 
markedly. The mean number of good errors 
(right three letters in the wrong order) for 
the codeable list was 4.6, for the noncode- 
able, 0.9 (F = 13.66). The mean number 
of integrative errors (giving three letters 
which do not go together) was 2.35 for the 
codeable list, 5.05 for the noncodeable (F = 
5.10). With 1 and 38 df, the F for the .05
level is 4.10, for the .01 level, 7.35. Thus, we 
conclude that the nature of the acquisition 
processes for the two lists was distinctly 
different although the rate of learning did 
not differ. 
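
The distinction between good errors and integrative errors can be made concrete with a small scoring sketch. It is an illustration under assumptions, not the scoring procedure actually used: the function name is hypothetical, and the rule that an integrative error must draw all of its letters from the list (anything else being scored "other") is an assumption added here to make the categories exhaustive. The trigrams are those of the codeable list given in the text.

```python
from typing import List

CODEABLE = ["XSI", "YDA", "IGB", "NME", "ETG", "EWF",
            "TLO", "OBJ", "XTA", "ANC", "WRO", "INW"]


def classify_response(response: str, trigrams: List[str]) -> str:
    """Score one written response against the list of presented trigrams."""
    if response in trigrams:
        return "correct"
    # Good error: the right three letters of some trigram in the wrong order.
    if any(sorted(response) == sorted(t) for t in trigrams):
        return "good"
    # Integrative error (assumed definition): letters that all occur in the
    # list but do not go together as the letters of any single trigram.
    list_letters = set("".join(trigrams))
    if all(ch in list_letters for ch in response):
        return "integrative"
    return "other"


for written in ["XSI", "SIX", "XSA", "QQQ"]:
    print(written, classify_response(written, CODEABLE))
```

Tallying these categories per S for the codeable and noncodeable lists would yield the two error counts compared above.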

The results of the subsidiary experiment, 
taken in conjunction with the results of Ex- 
periment 1, provide the following conclu- 
sions with regard to acquisition: 

1. The Ss in all three rule conditions 
coded to words. The differences in rate of 
acquisition must be attributed to differences 
in ease of decoding. 

2. Since the Ss in Conditions 2R and 4R
did not differ with regard to rate of learning,
nor in terms of frequency of good errors,
we must conclude that in neither condition
were generalized decoding rules profitably
applied, if they were acquired at all.


EXPERIMENT 2 
In this experiment we will attempt to answer
two further questions growing out of
Experiment 1. We have seen that a list of
trigrams which can be encoded to words and 


process in a one-rule list
is so simple that acquisition is as rapid as
if the words per se were presented during
acquisition trials and produced on recall
trials.


and four-rule lists, 
did not profitably apply, or even possibly 
did i decoding rules 


Consider a two-rule list. If S detected that 
only two rules wi 
tempted to use this 
seen that a fairly 
was i 


Method 


Lists. The experiment involved eight types of 
FL lists, each type representing a different experi 


As words or a 
constituted either two or four
concepts. The number of different variables involved
in these lists requires a fairly complex symbol
system to designate the lists as follows:


List W-2    List W-4
List 1-2    List 1-4
List 2-2P   List 4-4P
List 2-2U   List 4-4U


The W refers to the fact that the 
sented as words to S and were recalled as words. 
All other lists were 
recalled as nonwords, 
other lists refers to the number of decoding rules, 
The second number (or the only number for the W 
lists) refers to the number of concepts involved. 
Thus, for List 1-2, а 
рісаЫе, and the 12 
representing the names of objects falling into two 
concepts, with 6 in each. The P refers to the fact
that the concepts and rules were paired in the sense


One of the questions asked above was whether
or not a one-rule
rapidly as a word 
ing of List W-2 versus List 1-2, and List W-4 
versus List 1-4 will 


compared with the learning in Lists 2-2U and 4-4U
to provide an answer to the second question. As
can be seen, therefore, four of the lists were used
to answer one of the questions, four to answer the 
other. To a large extent in the treatment of the 
results we will treat these as two separate experi- 
ments although all conditions were run in a single 
experiment. 

In discussing the lists above, we spoke of types
i stemmed from the 


TABLE 1 


EIGHT LISTS REPRESENTING THE EIGHT
CONDITIONS OF EXPERIMENT 2


^ 2P 2:0 
кт ur ret 
on тог 10* 
MJA AMI AMJ 
YHA AYM AYM 
nny YER YER 
TOA ATO OTA 
nat pnt змі 
хво вхо вхо 
NCA CNA CNA 
RIA JRA AN 
TVA VTA VTA 
OKE кок кок 

14 44Р 440 
YHA NYA NYA 
MJA JMA AMI 
art rar "mo 
мл uM JMI 
MTO отм. ONT 
LHA ли, LHA 
NBI NBI InN 
хво хво охв 
TVA TVA TVA 
ЕТО окт тко 
WIA awi WIA 
GLE EGL кл 


HAY; HAL, ABE, BEN, JOB, JIM, том. From this pool 
of items, lists were constru i 
cedures at every point of decision, excepting one. 


rule, represented by the order 3-1-2, was used for
the one-rule lists. Data from Experiment 1 showed
that this order ї 
differed from any other order. As noted above, two 
different sets of lists were derived. After one set 
was obtained by random procedures a completely
new set was put together by further random
procedures.

The eight lists for one of the sets are shown in
Table 1 as a means of illustrating
It is to be noted that List W-2 consists of six words
from each of two concepts. All
the other two-concept lists consist of the rearrangement
of the letters of these 12 words. In List 1-2,
all trigrams
decoding rule holds. In List 2-2P, two rules (2-3-1,
1-3-2) occur, and one holds for the six food “words,”
one for the six containers. In List 2-2U, a rule
applies to some items in both concepts. The com-

i through fi 


all of the four-concept lists. Of course, in 
presenting the lists, the concept instances were not 
grouped. Rather, four different random orders 
were used. 

Considering the nonword units in both sets to- 
gether, the average summed letter-association val- 


[mdp name, or a part of the 
Design. A total of 32 Ss was assigned to each
of the eight lists


versus one rule as the only significant com- 
ponent (F = 58.15). Whether two or four 
concepts were involved was irrelevant.

As may be seen in the right panel of Figure 2,
no simple answer was obtained to the


Fig. 2. Acquisition as a function of number of
concepts, trigram lists versus word lists
(left panel), and pairing of rules and concepts for
two- and four-rule lists (Experiment 2). Ordinate:
mean trials to learn; abscissa: number of concepts.

second question, the question being whether
or not with two and four rules, acquisition
could be facilitated if a distinction was
made among the items paired with different
rules. We may first note that when concepts
and rules were not paired (Lists 2-2U and
4-4U) there was no difference in the acquisition
between the two-rule and the four-rule
lists. However, both took more trials to
learn than the one-rule lists (left panel),
thus confirming the findings of Experiment
1. With the concepts and rules paired, on the
other hand, the learning of the two-rule,
two-concept list (List 2-2P) appeared to be
facilitated while the four-rule, four-concept
list (List 4-4P) appeared to be inhibited.
An analysis of variance gave an F of 6.05
for number of rules (identical to number of
concepts) and an F of 7.36 for the interaction
between paired-unpaired and number
of rules. With 1 and 124 df, an F of 3.92 is
required for the .05 level, 6.84 for the .01
level. Using the within-groups variance estimate
to obtain the error term, the difference
between Lists 2-2P and 2-2U gave a t
of 2.07, and for the difference between Lists
4-4P and 4-4U, 1.69. Thus, statistically


cepts for the four-concept list actually in- 


shown shortly, the error data make it appear 
fact present. 


Fig. 3. Errors in learning the two- and four-rule
lists (Experiment 2). Abscissa: number of concepts;
separate curves for paired and unpaired lists.


It is worth noting that with two rules
paired with two concepts (List 2-2P) performance
was essentially equivalent to that
shown for the one-rule lists. Finally, it may
be noted that for all nonword lists all relationships
observed in Figure 2 were also
evident in an analysis of performance on the
first two trials of learning.

Overt Errors. The central interest lies in
good errors—the right three letters given in 
the wrong order. We may note first that the
number of such errors observed in the one-
rule lists was greater than in Experiment 1.
In Experiment 1 all Ss were given six trials
and reached about 10 correct responses on
the average on the sixth trial. The mean to-
tal good errors over six trials were 1.50, or
.25 per trial, and 17 of the 32 Ss made no
errors of this type. In the present study, in
which Ss were taken to a criterion of eight
correct responses, the mean good errors per
trial were .57 and .68 for Lists 1-2 and 1-4,
respectively. The number of Ss not making
such errors was 9 out of 32 for List 1-2 and
12 out of 32 for List 1-4. This increase in
error rate from Experiment 1 to Experiment 
2 is probably due to a joint effect of a more 
rapid rate of presentation of the items in 
Experiment 2 (2 seconds versus 3 seconds 
in Experiment 1), and a shorter recall pe- 
riod (36 seconds versus 48 seconds). 

Turning next to the two- and four-rule 
lists, the mean good errors per trial are 
plotted in Figure 3. For the unpaired lists 
it can be seen that there was little difference
between two and four concepts, but both 
means were higher than noted above for the 
one-rule list. Again, therefore, the results are 
quite consonant with those of Experiment 1. 
With the paired lists, however, quite a dif- 
ferent picture prevails. We have assumed 
that the learning of List 2-2P was facilitated 
by the coding system since the mean num- 
ber of trials to learn was equivalent to that 
shown for the one-rule lists. So, also, mean
good errors per trial were also equivalent. 
With List 4-4P, however, the mean good er- 
ror frequency was very high and appears to 
correspond to the inhibiting effect in learn-
ing noted earlier. (Statistical analysis of
the four distributions represented in Fig-
ure 3 showed that both number of concepts




the pairing 
Fs of 7.42 and 5.50, 


Clustering. Clustering ratios were deter-
mined for the recall protocols on the last
learning trial. The method used was the
same as that described for Experiment 1.
For both Lists W-2 and W-4, the mean clus-
tering ratios exceeded a chance
2.80 and 2.00, respectively). This was
true for List 1-2 (t = 3.49), but was not
true for any of the other five types of lists,
whether in the unpaired lists the scoring was 
by word concepts or by rules. 
Discussion 

If we examine the results of Experiments 
1 and 2 together, we may state some fairly 
of the coding 
should be clear that these conclusions per- 
tain only to instances in which changes in 
meaningfulness will result from encoding. 
We may note that the magnitude of the fa- 
cilitation in learning 
rule list was relatively small when compared
at one extreme with two- and four-rule lists
and at the other extreme, with a word list.
Learning of a one-rule list was
more rapid than the learning of a noncon-
cept two- or four-rule list, but the rate was


much slower than that shown for word - 


cept list was not facilitated at all by en-
coding (when compared with a noncodeable
list). Thus, the range of 
overall learning differences produced by the 
coding system in the present experiments 
was sharply limited, even though differences 
in the nature of the overt errors found in the 
subsidiary experiment of Experiment 1 
clearly indicated that the presence of the 
coding system produced distinctly different
“strategies” of learning. 


apparent in 
Experiment 1 or Experiment 2 for the
usual four-rule list. We attribute this in- 
crease in error frequency to at- 
tempts to associate decoding rules with con- 
cepts. 


EXPERIMENT 3 


In this experiment, as well as in all of the 
remaining experiments to be reported, PAL 
is involved. The critical terms in Experi-
ment 3 are the response terms, and the var- 
iable is the number of decoding rules. The 
thinking behind the experiment is much the 
same as that behind Experiment 1, in which 
FL was used. Assuming that a major por- 
tion of overall PAL involves the acquisition 
of the as such, encoding the tri-
grams to words should facilitate the acquisi-
tion of the three letters making up the tri-
gram. This follows from the fact that the
encoded unit has higher meaningfulness than 
the unit as presented. However, again there 
is the issue of decoding; if the decoding is 
difficult the use of the built-in coding system 
may be inefficient. As in Experiment 1, ease 
of decoding was manipulated by varying the 
number of different decoding rules applica- 
ble to the 12 response terms. 
Method


Lists. The lists used as response terms were ex- 
actly the same as the lists used in Experiment 1. 
These lists allowed one, two, or four decoding
rules and represented Conditions 1R, 2R, and 4R. 
The sublists under these three basic lists were also 
exactly the same as in Experiment 1. The stimulus 
terms were the numbers 1 through 12, paired ran- 
domly with the response terms. Four different or- 
ders of the pairs were used in the presentation of 
any one list. 

Procedure. The lists were presented at a 4:3- 
second rate for anticipation learning, with the re- 
sponse terms being spelled. The 4-second anticipa- 
tion interval was used to allow sufficient time for 
decoding (if such occurred). Learning was carried 
to a criterion of eight of the 12 associations given 
correctly on a single trial. 

Design. A total of 32 Ss was assigned to each of 
the three types of rule conditions. The method of 
assignment was exactly the same as for Experiment
1.


Results 


Trials to Criterion. Acquisition curves, 
plotted as mean trials to attain successive 
criteria, are shown in Figure 4. The F for 
trials to reach the criterion of eight correct 
responses on a single trial was 4.26; with 2
and 93 df, an F of 3.10 is required for the
.05 level of significance, 4.83 for the .01
level. As in Experiment 1, the difference was 
produced largely by the more rapid learning
under Condition 1R as compared with Con-
ditions 2R and 4R, with the performance
under the latter two conditions differing but
little. We conclude, therefore, that with one
decoding rule, Ss used the built-in coding
system and learning was facilitated thereby.
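
The degrees of freedom quoted for this analysis follow from the design: three rule conditions with 32 Ss assigned to each. A minimal sketch of that arithmetic, assuming a one-way between-groups analysis with equal group sizes (the function name is ours and purely illustrative):

```python
def one_way_df(n_groups, n_per_group):
    """Return (between-groups df, within-groups df) for equal-sized groups."""
    df_between = n_groups - 1
    df_within = n_groups * (n_per_group - 1)
    return df_between, df_within


# Experiment 3: Conditions 1R, 2R, and 4R, with 32 Ss in each condition.
print(one_way_df(3, 32))  # prints (2, 93)
```

The result agrees with the 2 and 93 df against which the F of 4.26 is evaluated.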

Overt Errors. Again, an analysis of overt 
errors provides some evidence as to the 
source of the facilitation in Condition 1R.
We have assumed that an effective coding 
system allows more rapid integration of the 
trigrams, hence acts primarily on the acqui- 
sition of the responses per se. We assume 
that increased meaningfulness (word versus 
trigram) will have little if any effect on the 
associative stage in the same sense that var- 
iation in stimulus meaningfulness has little 
effect on overall PAL. An assessment of 
these propositions can be made by examin- 
ing the nature of the overt errors. 

Fig. 4. Acquisition of paired-associate lists as a function of number of decoding rules (Experiment 3). [Plot omitted: mean trials to attain successive criteria.]

Good errors should be more frequent for
Conditions 2R and 4R than for Condition
1R if decoding is facilitated by the applica-
tion of a single rule. The mean errors per
trial for errors of this type (including a few
for each rule condition which were not
paired with appropriate stimuli) were .22,
.95, and 1.01, for Conditions 1R, 2R, and
4R, respectively. The F was 15.18. We con-
clude that a single decoding rule facilitated
ordering of the three letters.

Failure in the associative stage would be 
indicated by the correct three letters being 
given in the correct order but to an inappro- 
priate stimulus. The mean errors per trial of 
this type were .58, .42, and .44, for Condi- 
tions 1R, 2R, and 4R, respectively. The F 
was less than 1. Thus, once a trigram was
fully integrated, there was no difference in 
the rate at which it was associated to the ap- 
propriate stimulus term; number of decod- 
ing rules did not influence the associative 
stage. 

The third class of errors would be repre- 
sented by S giving three letters which do not 
belong together, or giving partial responses 
(one or two letters). The mean numbers of 
such errors per trial were .37, .43, and .63,
for the three conditions in order. These 
means did not differ significantly (F =
1.46). The implication of this finding will
be discussed at a later point. 
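
The three classes of overt error distinguished above can be stated as a small decision procedure. The sketch below is only an illustration: the stimulus-response pairs are hypothetical (the actual trigram lists are not reproduced in this section), and the class labels other than "good error" are ours.

```python
# Hypothetical list: stimulus number -> trigram response term.
EXAMPLE_PAIRS = {"1": "TCA", "2": "ODG", "3": "NPE"}


def classify_response(response, stimulus, pairs):
    """Classify one overt response given to `stimulus`.

    `pairs` maps each stimulus term to its correct trigram.
    """
    response = response.upper()
    if response == pairs[stimulus]:
        return "correct"
    trigrams = set(pairs.values())
    if response in trigrams:
        # Correct letters in the correct order, but to an inappropriate stimulus.
        return "associative error"
    if any(sorted(response) == sorted(t) for t in trigrams):
        # The right three letters of a list trigram, given in the wrong order.
        return "good error"
    # Partial responses, or three letters which do not belong together.
    return "other error"


print(classify_response("CTA", "1", EXAMPLE_PAIRS))  # prints good error
```

As in the counts reported above, a good error is scored here whether or not the misordered trigram is paired with its appropriate stimulus.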


Discussion 


The results for Experiment 3 closely par- 
allel those found for FL in Experiment 1 in 
which the same trigrams were used. When 
the trigrams could be decoded by a single 
rule, performance was superior to that 
shown when two and four rules were re- 
quired. The latter two conditions have not 
differed in either experiment. 

The error data indicated that the facilita- 
tion in the one-rule list resulted from ease 
of decoding which effectively allowed easy 
ordering of the three letters. Thus, for Ss 
having the one-rule list, we presume that an 
association was established between the 
stimulus term and the encoded trigram (a 
word) and that the single decoding rule al- 
lowed S to decode easily, thereby producing 
the three letters in the appropriate order. 


Furthermore, we believe the evidence indi-
cates strongly that Ss in Conditions 2R and
4R were also encoding to words and that an
association was established between the 
stimulus term and the word. This position 
may be contrasted with a position that Ss 
were acquiring the trigrams as trigrams and
not encoding to words, hence acquiring an
association between the stimulus term and
the first letter of the trigram as presented.
We will summarize the evidence which leads 
us to the first position and at the same time 
denies the validity of the second position. 
That is, we will summarize the evidence in- 
dicating that Ss in Conditions 2R and 4R 
were encoding to words. As a reference
point, we will use Condition 1R where en- 
coding to words must have occurred to some 
degree. There are two lines of evidence. 

1. One type of error did not differ in fre-
quency for the three conditions. These were 
partials and the giving of three letters which 
did not constitute the three letters of a tri- 
gram in the list. If Ss in Conditions 2R and 
4R were learning without encoding to words 
(if they were simply acquiring the three let- 
ters in the order given on the tape), such 
errors should have been much more frequent 
for Conditions 2R and 4R than for Condi- 
tion 1R. On the other hand, if Ss in all three 
conditions were decoding to words and ac- 
quiring an association between the stimulus 
terms and the words, there is no reason why 
the type of error under consideration should 
differ for the three conditions. 

2. If Ss in Conditions 2R and 4R were 
decoding to words, but having difficulty in 
decoding, the right three letters should be 
given to the appropriate stimulus term but 
in the wrong order (good errors). Therefore,
more such errors should have occurred for
Conditions 2R and 4R than for Condition
1R. As was shown in the previous section,
this is exactly what happened. In effect, 
such errors indicate that the associate phase 
occurs prior to the complete response-learn- 
ing stage. Such results are found only when 
some mechanism makes possible the knowl- 
edge of the correct three letters without con- 
comitant knowledge of how the letters 
should be ordered (Underwood, Ekstrand, & 
Keppel, 1964). We believe that knowledge 
of the three correct letters was provided S 
early in learning by encoding to words, but 
appropriate decoding habits were not imme-
diately available except in Condition 1R. 

The above lines of reasoning lead to the 
conclusion that Ss in Conditions 2R and 4R 
were encoding to words just as were the Ss 
in Condition 1R and just as were Ss in all 
rule conditions in Experiments 1 and 2. That 
Condition 2R did not differ from Condition 
4R suggests that again Ss did not correctly 
apply, or even possibly did not learn gen- 
eralized decoding rules but rather learned 
specific rules for each response term. In
short, the interpretation of the results for 
PAL learning of Experiment 3 is the same 
as the interpretation applied to the results 
of Experiment 1. 


EXPERIMENT 4


With this experiment we turn our atten- 
tion to stimulus coding. In viewing the stim- 
ulus term in PAL we assume that the func- 
tional stimulus is a stable response to the
formal stimulus term which allows differen- 
tiation between it and functional stimuli 
elicited by the other formal stimulus terms 
in the list. There would seem to be a consid- 
erable range of possibilities for advanta- 
geous coding since S does not have to pro- 
duce the formal stimulus term—he does not 
have to decode it. The stimulus-selection 
studies, referred to in the introduction, rep- 
coding in the sense that 
the formal stimulus term and the functional 
stimulus are not isomorphic. Stimulus selec- 


response term is a highly relevant variable 
for learning; therefore, the coding systems 
always involved a transformation to pro- 


success when stimulus coding is involved be- 
cause meaningfulness differences of stimulus 
terms produce only slight differences in
learning. We have turned, therefore, to an-
other task variable, intralist similarity, as
a means of making stimulus-coding studies
more profitable. 

Assume a stimulus term in a paired-asso-
ciate list is ESMOK. This anagram can be
easily solved to produce SMOKE, but whether
or not the functional stimulus is SMOKE or
ESMOK should make little difference in over-
all learning since differences in meaningful-
ness have so little effect on learning. As-
sume, however, that another stimulus term
is EMSUO. The two uncoded stimuli, ESMOK
and EMSUO, have very high formal similar-
ity, and as uncoded stimuli would be ex-
pected to retard learning as compared with
two stimuli with low formal similarity. This
would be particularly true if the order of
the letters of each stimulus is varied from
trial to trial to make stimulus selection
(such as a single letter becoming the func-
tional stimulus for each) difficult. We know
that formal stimulus similarity among com-
mon words has little influence on PAL (Mc-
Gehee, 1962). If, therefore, the two stimuli
are encoded to the words SMOKE and MOUSE,
the deleterious effects of intralist similarity
should be reduced. It is by such procedures
that we have attempted to manipulate cod-
ing processes of stimulus terms.
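
The high formal similarity of the two example stimuli can be made concrete by counting the letters they share. This index is our own simple illustration, not a measure defined in the text; because it ignores letter order, it gives the same value for any rearrangement of the two stimuli.

```python
def shared_letters(a, b):
    """Letters common to two stimuli, ignoring order and case."""
    return set(a.upper()) & set(b.upper())


common = shared_letters("ESMOK", "EMSUO")
print(sorted(common))  # prints ['E', 'M', 'O', 'S']
print(len(common), "of 5 letters shared")
```

Four of the five letters are shared, so any single letter S might select as the functional stimulus is likely to occur in both items.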

In Experiment 4, three variables were ma- 
nipulated. The first variable was whether 
or not a five-letter stimulus could be trans- 
formed to a word. In half the conditions the
anagrams (stimulus terms) were solvable,
in half they were not. As a second variable,
the ease of solution was varied, and as a
third, formal intralist similarity.


Method 


Lists. Eight different lists, differentiated by the 
stimulus terms, were used to represent two levels 


S will be used to designate the solvable lists to dif-
ferentiate them from the unsolvable lists (U). 
There were eight five-letter stimuli for each list. 
The S stimuli of one list consisted of the letters of 
eight common English words, oceurring 50 or more 


- 


+ 


SrvpiEs or CODING 15 


times per million words as given by Thorndike and 
Lorge (1944). The five letters of each stimulus rep- 
resented single-solution anagrams and arranged as 
words were: CROWD, MAJOR, SMOKE, BRAVE, JUDGE, 
FAULT, KNIFE, PORCH. The ease of solving anagram 
stimuli is heavily dependent upon the order in 
which the letters are presented. Of the 120 orders
into which five letters may be permuted, Mayzner 
and Tresselt (1958) have identified 10 letter or- 
ders, 5 of which Ss translate quickly into words 
and 5 of which take significantly more time for 
translation. In the present experiment, the ease of 
encoding to a word was manipulated by using the 
easy (E) and difficult (D) letter orders. Four of 
the five E orders were chosen randomly, and four 
of the five D orders were chosen in the same man- 
ner for use in the present experiment. For example, 
the four E orders for CROWD were CRODW, OWDCR,
ROWDC, and DCROW. For the same word, the four D
orders were DRWCO, CWRDO, RDOCW, and OCWRD. Thus,
two lists are formed from the same eight words, 
these two lists representing solvable anagrams, with 
easy and difficult letter orders (S-E and S-D). For 
the E list, on successive trials in PAL, the four dif-
ferent orders of the stimuli were used: for four
trials the letter orders of each stimulus changed 
from trial to trial. The four different orders were 
then repeated on the next four trials and so on. 
Thus, the Ss were presented with a constantly 
changing stimulus in terms of the order of the 
letters but with constancy in the particular five 
letters. The D list was handled in exactly the same
way in that the four difficult orders of the letters 
occurred on successive trials. 
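
The count of 120 letter orders and the trial-to-trial cycling of the four orders can be sketched as follows. The E and D orders are those given for CROWD as we read them from the text (each must be a rearrangement of the same five letters). That the four orders recur in the same sequence within each block of four trials is an assumption of this sketch, and the function name is ours.

```python
from itertools import permutations

WORD = "CROWD"
E_ORDERS = ["CRODW", "OWDCR", "ROWDC", "DCROW"]  # easy orders, as read from the text
D_ORDERS = ["DRWCO", "CWRDO", "RDOCW", "OCWRD"]  # difficult orders, as read from the text

# Five distinct letters may be permuted into 5! = 120 orders.
ALL_ORDERS = {"".join(p) for p in permutations(WORD)}
print(len(ALL_ORDERS))  # prints 120


def order_on_trial(orders, trial):
    """Letter order shown on a 1-indexed trial; the orders repeat every four trials."""
    return orders[(trial - 1) % len(orders)]


# Trials 1 and 5 show the same order; trials 1 through 4 all differ.
print([order_on_trial(E_ORDERS, t) for t in range(1, 6)])
```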

The third variable, intralist similarity, was var- 
ied in terms of the number of duplicated letters 
among the eight stimuli. For the above lists, 22 
different letters were used, and it is considered a 
low-similarity list. Letting LS stand for low simi- 
larity, the two lists described thus far would be 
symbolized S-E-LS, and S-D-LS. For the high- 
similarity (HS) stimuli the following eight words 
were used: REACH, TRAIN, DANCE, STORE, THIRD, STAND, 
CHOSE, NOISE. These words were equivalent in fre-
quency to the LS list, but only 11 different letters 
were involved in the eight words. As with the LS 
list, these words represented single-solution ana- 
grams, and again, the E and D orderings of the 
letters were used to form two lists. Thus, these two 
lists were symbolized as S-E-HS and S-D-HS.
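
The letter counts that define the two similarity levels can be verified directly from the word lists given above. A brief check (variable and function names are ours):

```python
LS_WORDS = ["CROWD", "MAJOR", "SMOKE", "BRAVE", "JUDGE", "FAULT", "KNIFE", "PORCH"]
HS_WORDS = ["REACH", "TRAIN", "DANCE", "STORE", "THIRD", "STAND", "CHOSE", "NOISE"]


def distinct_letters(words):
    """Set of different letters used across all stimuli of a list."""
    return set("".join(words))


print(len(distinct_letters(LS_WORDS)))  # prints 22
print(len(distinct_letters(HS_WORDS)))  # prints 11
```

With 40 letter positions in each list, the high-similarity list reuses each of its letters about twice as often as the low-similarity list does.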

We turn next to the lists with unsolvable stimuli 
(U lists). These stimuli were constructed from the 
S stimuli merely by replacing a vowel with another
vowel such that the five letters could not produce 
a word. Since the letters of each stimulus could not 
be rearranged to form a word, these lists serve as 
controls for the S lists. For example, CROWD became
CRIWD, MAJOR became MOJER, and so on. This was
done for both the high-similarity and the low-
similarity lists. The ease-of-solution variable was
also included, although clearly it does not have the
same meaning as it does for the S lists. Neverthe-
less, by ordering the letters to correspond to the
easy-solution stimuli of the S lists, and also ac-
cording to the difficult-solution orders, it seemed
possible that the differences in the wordlike char- 
acter of the stimuli which resulted might influence 
the learning. However, we had no hypothesis con- 
cerning the effect of this E-D variable for the U 
lists; their use was necessary, however, for control 
purposes. 

In summary, it can be seen that the eight lists
were constructed so that four of the lists had stim- 
uli which allowed solution to words and four did 
not (S versus U). Within each four, the solutions 
were easy or difficult (E versus D), and the re- 
maining variable was high similarity versus low 
similarity (HS versus LS). The response terms for 
all lists were five-letter words, having a Thorndike-
Lorge frequency of 10-12 per million. These words
were BACON, JOUST, SHADY, TARRY, LOGIC, SMASH,
FERRY, SCOWL. These response terms were assigned 
randomly to the stimuli for the LS lists and again 
to the stimuli for the HS lists. 

Procedure. Upon entering the laboratory S was
assigned to one of the eight lists from a predeter-
mined schedule sheet. The schedule sheet con- 
sisted of 10 blocks of the eight conditions (lists), 
with the order within blocks being random. Thus, 
a total of 10 Ss was assigned to each list. 
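
A schedule of this kind (10 blocks, each containing the eight conditions once in random order) can be generated as in the following sketch. The labels for the U lists are formed by analogy with the S-list symbols and are our assumption, as is the use of a seeded pseudorandom generator.

```python
import random

# Eight lists: solvable or unsolvable, easy or difficult order, low or high similarity.
CONDITIONS = ["S-E-LS", "S-D-LS", "S-E-HS", "S-D-HS",
              "U-E-LS", "U-D-LS", "U-E-HS", "U-D-HS"]


def make_schedule(n_blocks=10, seed=None):
    """Return an assignment order in which each block holds every condition once."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = CONDITIONS[:]
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule


schedule = make_schedule(seed=1)
print(len(schedule))  # prints 80
```

Successive Ss are assigned down the schedule, so that after every eighth S each list has been used equally often.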

Learning of the lists was by the anticipation 
method with a 4:2-second rate being employed, 
and a 6-second intertrial interval. Learning was 
continued until all eight responses were anticipated 


since three different conditions were involved, lit- 
tle bias should result. These Ss were replaced with 
three additional Ss. 

In addition to the usual PAL instructions, Ss 
were informed about the changing order of the 
letters of the stimuli, but no mention was made of 
the possibility that the five-letter unit might be 
encoded in any way. 

After S attained the criterion, stimulus recall 
was requested. Each of the response terms was pre- 
sented successively at a 6-second rate with S in-
structed to give the five letters which had been 
paired with the word during learning. The Ss were 
further told that they were to give as many letters 
as possible even if they did not remember all five 
of the letters. The instructions did not specify any 
particular order of the letters. Two such stimulus- 
recall trials were given. Finally, S was given a 
printed list of the response words and was asked 
to write down all of the letters he could for the 
stimulus paired with each response word. On one 
column of blank spaces he was asked to write the 
stimulus letters which he was sure were correct; in 
another column of blanks he was asked to record 
letters which were more or less guesses. No time 
limit was imposed. 


Results 
Fig. 5. Acquisition of paired-associate lists as
a function of stimulus-term similarity and other
stimulus-term properties (Experiment 4). See text
for complete explanation. [Plot omitted: mean trials to learn versus stimulus similarity, low and high.]

Trials to Criterion. The mean numbers of
trials required to reach eight correct re-
sponses on a single trial for each of the eight
lists are plotted in Figure 5. It may first be
noted that stimulus similarity influenced 
rate of learning for all types of lists, with 
more trials required for HS than for LS. 
The F for similarity was 23.91. Two other 
sources of variance were significant. The 
easy-difficult variable (E-D) produced an 
F of 4.64, and the E-D x S-U interaction 
gave an F of 8.14. With 1 and 72 df, an F of
3.98 is required for the .05 significance level, 
7.01 for the .01 level. The significant effects 
were produced primarily by the more rapid
rate of acquisition of the S-E lists than all 
other lists. The S-E lists had solvable stim- 
uli with easy letter orders. As implied by the 
reasoning in the introduction to this experi- 
ment, an interaction between similarity and 
E-D for the solvable stimuli was antici- 
pated. More particularly, the effects of simi- 
larity should be less with the E lists than 
with the D lists on the grounds that with 
the E lists more Ss would solve the anagram 
stimuli and thus diminish the inhibiting ef-
fects of high intralist similarity. With LS,
anagram solution should have relatively lit-
tle effect on learning; with HS, the effect
should be appreciable.
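
The three reported F ratios can be set against the two critical values stated for 1 and 72 df. The sketch below only restates that comparison; the names are ours, and the error df is taken to follow from eight lists with 10 Ss each.

```python
# Critical values stated in the text for 1 and 72 df.
CRITICAL_F = {".05": 3.98, ".01": 7.01}

# F ratios reported for the three significant sources of variance.
REPORTED_F = {"similarity": 23.91, "E-D": 4.64, "E-D x S-U": 8.14}


def level_reached(f_ratio):
    """Most stringent stated significance level that an F ratio reaches."""
    if f_ratio >= CRITICAL_F[".01"]:
        return ".01"
    if f_ratio >= CRITICAL_F[".05"]:
        return ".05"
    return "n.s."


error_df = 8 * (10 - 1)  # eight lists, 10 Ss per list
print(error_df)  # prints 72
for source, f_ratio in REPORTED_F.items():
    print(source, level_reached(f_ratio))
```

On these values the similarity effect and the E-D x S-U interaction reach the .01 level, and the E-D main effect reaches the .05 level only.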

The learning for these four S lists is rep-
resented by the two solid lines in Figure 5.
While the differences were in the expected
direction the interaction was far from sig-
nificant. The S-D list required more trials
to learn than either of the U lists with HS
stimuli. However, these differences were far
from significant. In the course of perform-
ing the experiment, two additional groups of 
10 Ss were each given the S-D and the U-D 
lists, and for these Ss the mean trials to 
learn for the S-D list were 24.2, for the U-D, 
25.9. Therefore, we must conclude that there 
are no differences in learning between the 
U lists and the S-D lists. The fact that the 
stimuli were solvable appears to have been 
irrelevant unless the order of the letters 
made the anagram easy to solve (S-E lists). 

Overt Errors. For each S the mean num- 
ber of overt errors per trial was determined, 
and the mean per list calculated. The fewest 
errors occurred for the S-E lists with the 
others producing about the same number at 
each level of similarity. For all lists, more 
errors occurred for the HS lists than for the 
LS lists (F = 7.81). 

Stimulus Recall. Stimulus recall was car-
ried out in two stages. First, S was given two 
paced recall trials in which each response 
term was shown for 6 seconds and S was 
instructed to give as many letters of the ap- 
propriate stimulus term as possible. The 
correct stimulus term was never shown. At
the second stage, S was given the response
terms on a sheet of paper and was asked to 
provide the correct stimulus for each, list- 
ing those letters he was sure of in one col- 
umn, and those which he was not sure of or 
had guessed in another. Since there was so 
little difference in the results between the 
first paced and the second paced recall trial, 
and between the two paced and the unpaced 
recalls, unless noted otherwise, the results 
will be given only for the first paced recall 
trial. 

The major purpose for taking stimulus re- 
call was to determine whether or not Ss had 
been encoding to words in learning the solv- 
able lists. Since Ss were never informed that 
the encoded letters would produce words
and since the stimulus-recall instructions
merely requested S to give the appropriate 
letters of the stimulus for each response 
term, if the appropriate five-letter words 
were given as stimuli, it would support the 
conclusion that Ss were encoding the stimuli 
to words. 

The results were quite unambiguous on 
this issue. Appropriate five-letter words 
were produced only for the two S-E lists; 
not a single five-letter word was given for 
either of the S-D lists. For List S-E-LS, all 
10 Ss produced at least one appropriate five- 
letter word, the number varying from 1 to 8, 
with an average of 4.3 words per S. For List 
S-E-HS, 7 of the 10 Ss produced at least 
one appropriate five-letter word, with an 
average of 2.6 per S. When a word was given, 
pairing with the appropriate response was
without error for either list. It will be re- 
membered that in terms of the five letters 
involved in the stimuli, the S-D lists were 
identical to the S-E lists. The S-D lists took 
more trials to learn than did the S-E lists, 
and there was no evidence of encoding to 
five-letter stimuli in the S-D lists whereas 
there was in the S-E lists. It must be con- 
cluded that encoding to words was responsi- 
ble for the more rapid learning of the S-E 
lists. 

Considering now only the S-E lists, more 
trials were required to learn under HS than 
under LS (t = 2.20). Since fewer words were 
given in stimulus recall for the HS list than 
for the LS list, we presume that there was 
less stimulus coding (to five-letter words) in 
the HS list than in the LS list and that the 
difference in learning is reflected in the dif- 
ference in coding. As noted earlier, little or 
no difference as a consequence of coding
was anticipated for these two lists on the 
grounds that coding would effectively coun- 
teract the deleterious effects of high inter- 
stimulus similarity. Why this did not occur 
is not clear although it is not an unreason- 
able hypothesis to suggest that high formal 
similarity among anagrams may retard so- 
lution of the anagrams. 

A number of analyses were made on the 
characteristics of stimulus recall for the six 
lists where coding to five-letter words was 
absent. For all lists a few two-, three-, and


four-letter words were given as stimuli, but 
by far the predominant mode of recall did
not include words. The acquisition of these 
lists may be considered a problem of stimu- 
lus selection in which the critical variable 
is similarity. The higher the similarity the 
greater the number of letters required to 
produce a differentiating stimulus. The av-
erage number of correct letters in the stimuli 
produced was significantly greater for the 
HS lists than for the LS lists. Such evidence 
is not completely “clean,” however, because
the probabilities of a letter's being correct 
by guessing were greater for HS than for 
LS. Nevertheless, the finding is probably 
valid since the same result occurred when
S was given unlimited time and was asked
to give only the letters he was sure of (F =
8.67). The results also show, however, that
if we consider only responses consisting en- 
tirely of correct letters (regardless of num- 
ber), which were also correctly paired, more 
of the eight stimuli were given correctly by 
Ss learning the LS lists (an overall mean of 
59) than by those learning the HS lists 
(M = 45). Of course, those that were cor- 
rect for the HS lists consisted of a greater 
number of letters than those correct for the 
LS lists. 
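
The remark that a guessed letter is more likely to be correct under HS than under LS can be illustrated with a simple calculation. It assumes, purely for illustration, that S guesses a letter at random from the letters used in the list and that each stimulus contains five different letters. The list sizes of 22 and 11 letters are those given for the solvable lists; the unsolvable lists, formed by vowel replacement, may differ slightly.

```python
from fractions import Fraction


def chance_letter_correct(letters_per_stimulus, distinct_letters_in_list):
    """Chance that one randomly guessed list letter belongs to a given stimulus."""
    return Fraction(letters_per_stimulus, distinct_letters_in_list)


p_ls = chance_letter_correct(5, 22)
p_hs = chance_letter_correct(5, 11)
print(p_ls, p_hs)  # prints 5/22 5/11
```

Under these assumptions a guess is twice as likely to be scored correct in the HS lists, which is why the count of correct letters alone is not a clean measure.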
Discussion 
The discussion may be brief. The results 
were quite clear in showing that stimulus 
coding could to some extent be controlled by 
7. However, the coding system built in ap- 
pears to have been discovered only when it 
was fairly obvious that the letters when re- 
arranged produced a word, and this occurred
only when the letters were in the easy order.
If the letter orders made the anagram a
difficult one to solve, the learning proceeded 
much the same as it did in the control lists 
where solution was not possible. We do not 
know whether or not S discovered that the 
difficult anagrams could be solved; however, 
there was no evidence in the stimulus recall 
that the words became the functional stim- 
uli. 


EXPERIMENT 5


In the preceding experiment S was faced 
with an unstable stimulus term in the sense 
that four different orders of the letters were
used for each stimulus term. In conven- 
tional PAL the stimulus term is identical 
from trial to trial. We presume that varying 
the order of the letters from trial to trial 
makes the learning task more difficult be- 
cause it increases the difficulty involved in 
establishing a stable response to the stimu- 
lus to serve as the differentiating cue for the 
response-term association. Or, to say this 
another way, we assume that when the 
stimulus letters occur in a different order 
from trial to trial the process of stimulus 
selection is retarded. Nevertheless, that 
varying the order of the stimulus letters 
retards learning is an assumption, and it is 
the purpose of the present experiment to ex- 
amine the validity of this assumption.

The stimuli for the present experiment 
were again five-letter units, but in no case
could the letters be encoded to words. With 
these materials the hypothesis that the rate 
at which a stable response to each stimulus 
is established is inversely related to the 
number of different letter orders, was tested. 
The reference condition corresponded to tra- 
ditional PAL in which the order of the let- 
ters for each stimulus is invariant from trial 
to trial. In a second condition, two different 
orders of the letters were used, and in a 
third, four different orders, this latter cor- 
responding to the procedure used in Experi- 
ment 4. Interstimulus similarity was again 
varied with the expectation that the two 
variables would interact: that is, high in- 
terstimulus similarity was expected to be 
more and more detrimental to learning as 
the number of different orders of the stimu- 
lus letters increased.

The forgetting of lists of verbal units has 
been shown to be essentially unaffected by 
task variables such as meaningfulness and 
intralist similarity. In the present experi-
ment a further task variable (number of 
different letter orders of the stimuli) is ma- 
nipulated. Given the extreme instances of 
this variable, and of stimulus similarity, it 
is apparent that we have possibilities for a 
further test of the constancy of forgetting 
across task variables. Therefore, 24-hour re-
call was taken for all lists.


Method 


Lists. Each stimulus term consisted of five let-
ters, four consonants and a vowel. For the low-
similarity (LS) lists, 18 consonants (d, q, and y
excluded) and four vowels were used. To construct
the eight stimuli, eight sets of four consonants
each were drawn randomly from the pool of con-
sonants subject to the restrictions that a given
letter was not repeated within a set of four and
that each consonant occurred at least once but not
more than twice in all eight sets. Four of the five
vowels were then chosen randomly and each as-
signed to two sets. Thus, eight five-letter sets con-
stituted the eight stimuli. Four different, random
letter orders were then generated for each of the
eight stimuli. In the four-order condition each of
the stimuli assumed each of its four letter orders
once in every four trials (LS-4). In the two-order
condition, each of the stimuli assumed two letter


In constructing the high-similarity lists (HS), 
the 18 consonants were listed in order of frequency 
of use in words and 9 of these, representing the 
full frequency range, were used. Two of the five 
vowels were also chosen. Eight stimuli were then 
randomly formed from this pool of letters, sub-
ject to the restrictions that no letter was repeated
within a stimulus, that each consonant was used 
at least three times but not more than four times, 
and that each of the two vowels was used four 
times. The different orderings of the letters then 
took place exactly as with the LS lists.

From these procedures, the LS list (for one or- 
dering of the letters) was as follows: OTVBK, MXECH, 
TPAJS, OGXLR, EFKLW, PUVJH, SBCAN, WEZFN. The HS 
list was: BUNTK, FJANS, CNSBA, PBUFK, TPSJA, KCBPU, 
PCAJT, STUFJ. The above procedures were repeated
to produce an entirely independent set of lists. 
These two sets were used equally often, but since 
they did not introduce a significant source of var- 
iance, no distinctions between sets will be used in 
presenting the results. We will speak of six lists,
LS-1, LS-2, LS-4, HS-1, HS-2, and HS-4, where the
number refers to the number of different letter
orders of the stimuli in presenting the lists for 
learning. 
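
The restrictions stated for the HS list lend themselves to a mechanical check. The following is a minimal sketch in Python, not part of the original method; the function name `check_hs_list` is our own, and the list is the one printed above for one ordering of the letters.

```python
from collections import Counter

# HS list as printed in the text (one ordering of the letters).
HS_LIST = ["BUNTK", "FJANS", "CNSBA", "PBUFK",
           "TPSJA", "KCBPU", "PCAJT", "STUFJ"]
VOWELS = set("AEIOU")


def check_hs_list(stimuli):
    """Return True if the stimuli satisfy the three stated HS restrictions."""
    counts = Counter("".join(stimuli))
    consonants = {c: n for c, n in counts.items() if c not in VOWELS}
    vowels = {c: n for c, n in counts.items() if c in VOWELS}
    # No letter repeated within a five-letter stimulus.
    no_repeats = all(len(s) == 5 and len(set(s)) == 5 for s in stimuli)
    # Nine consonants, each used at least three but not more than four times.
    consonants_ok = (len(consonants) == 9
                     and all(3 <= n <= 4 for n in consonants.values()))
    # Two vowels, each used four times.
    vowels_ok = len(vowels) == 2 and all(n == 4 for n in vowels.values())
    return no_repeats and consonants_ok and vowels_ok


print(check_hs_list(HS_LIST))
```

The same function could be applied to any alternative HS set, since it encodes only the restrictions given in the text.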

The response terms were the same for all lists, 
and consisted of the following eight words: GROPE, 
WHINE, WAKEN, BLINK, MANLY, STUNG, RIGID, and 
FOCUS. The average frequency of these words in 
the Thorndike-Lorge list is about 10 per million. 

The pairs of the lists were presented in four dif-
ferent orders for learning by the anticipation pro-
cedure with a 4:9-second rate, and a 6-second inter-
trial interval. An equal number of Ss was started
on each order. 

Procedure. A total of 40 Ss was assigned to each 
of the six types of lists (20 to each type within a 
set), the assignment being random. All Ss had had
previous verbal-learning experience. Original learn-
ing was carried until S correctly anticipated six re-
sponses on a single trial. If S failed to reach the
criterion in 40 trials he was dropped and replaced. 
Four Ss were dropped for this reason, three for 
failure to learn an HS-2 list, and one for failure to 
learn an HS-4 list. Thus, the effects of similarity 
will be slightly underestimated. 


The usual paired-associate instructions were 
given with an addition which informed all Ss that 
the order of the stimulus-term letters might vary 
from trial to trial although the same five letters 
would always be paired with a given response term. 

Recall and relearning occurred 24 hours follow- 
ing original learning. Five relearning trials were 
given to all Ss. The order of the pairs, for recall, 
hence the order of the letters within the stimuli, 
was the same order as that on which S achieved the 
criterion of learning on the previous day. 


Results 


Trials to Criterion. The mean numbers of 
trials to reach the criterion of six correct 
responses for each of the six types of lists 
are plotted in Figure 6. The effects of simi-
larity (F = 64.86) and number of letter
orders (F = 27.65) were both significant be-
yond the .01 level of significance. The F for
the interaction (F = 4.57) was just short of
the .01 level; with 2 and 234 df, an F of
3.03 is needed for the .05 level, 4.69 for the
.01 level. Thus, interstimulus similarity
again was shown to be a highly significant 
variable and, clearly, difficulty in learning 
increased as the number of different orders 
of the stimulus terms increased. However, 
as can be seen from Figure 6, most of the 
change occurred between one and two letter 
orders, with only slight increases in diffi- 
culty between two and four letter orders. 
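
The critical values quoted for the interaction can be reproduced without tables. The sketch below is ours, not the authors' procedure. It relies on the fact that, when the numerator has exactly 2 df, the upper-tail probability of F has the closed form (1 + 2F/df2)^(-df2/2); the function names are illustrative.

```python
def f_survival_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom."""
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Value of F exceeded with probability alpha (2 and df2 df)."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


DF2 = 234  # error df reported for the similarity x letter-order interaction

print(round(f_critical_df1_2(0.05, DF2), 2))   # critical F, .05 level
print(round(f_critical_df1_2(0.01, DF2), 2))   # critical F, .01 level
print(round(f_survival_df1_2(4.57, DF2), 3))   # probability for observed F
```

Under this closed form the critical values come out at roughly 3.03 and 4.70, close to the tabled 3.03 and 4.69, and the observed interaction F of 4.57 has a probability of about .011, in line with the statement that it fell just short of the .01 level.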

Acquisition curves revealed no informa- 
tion not given in trials to reach the criterion. 
However, it occurred to us that in the more 
difficult lists some bias might be evident for 
the letter orders occurring on a particular
trial. Such a bias would be evident in a
cyclical-like curve, peaking on the trials on 
which those particular letter orders oc- 
curred. The usual trials-to-successive-cri- 
teria curves would not detect such an effect. 
Therefore, for each list “backward” curves 
were plotted in which the mean number cor- 
rect was determined on the criterion trial 
minus 1, minus 2, and so on. The only evi-
dence for a cyclical effect occurred with
List HS-4 (the most difficult list), and this
was not a pronounced effect.
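
The "backward" curves described above can be sketched as follows. This is an illustration under our own assumptions, not the authors' computation: each S's record is taken to be a list of the number correct on each trial, ending with the criterion trial, and the two records shown are hypothetical.

```python
def backward_curve(records, depth):
    """Mean number correct on the criterion trial minus 1, minus 2, ..., minus depth.

    records: one list per S giving number correct on each trial, with the
             criterion trial as the last entry.
    Ss who did not have a trial that far back are left out of that mean.
    """
    means = []
    for k in range(1, depth + 1):
        values = [r[-1 - k] for r in records if len(r) > k]
        means.append(sum(values) / len(values) if values else None)
    return means


# Hypothetical records for two Ss (criterion: six correct on a single trial).
records = [[1, 3, 4, 6], [2, 2, 5, 3, 6]]
print(backward_curve(records, 4))
```

A cyclical bias tied to particular letter orders would appear in such a curve as peaks recurring every two or every four trials back from the criterion trial.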

Overt Errors. The mean number of overt 
errors per trial in reaching the criterion of 
six correct responses was calculated. No
source of variance attained statistical sig- 
nificance. Even an independent test for the 
differences between HS and LS with four 
orders of the stimulus letters did not achieve 
significance. This result may appear at first 
to contradict the findings of Experiment 4 
where error rates did differ as a function of
similarity. However, if only the lists having 
nonsolvable stimuli in that experiment are 
considered (corresponding to those used in 
the present study), the difference in error 
rates between HS and LS is not significant 
(F = 2.71). 

Recall and Relearning. The recall scores 
were essentially identical for all six types of 
lists. For the LS lists the mean numbers of 
items recalled were 3.58, 3.78, and 3.75, for 
one, two, and four stimulus orders, respec- 
tively. The corresponding values for the HS 
lists were 3.73, 3.83, and 3.70. Thus, neither 
interstimulus similarity nor number of or- 
ders of the letters in the stimuli during 
learning influenced recall. Probability anal-
yses used to adjust for possible minor dif-
ferences in the degree of learning resulting
from different rates of approach to the cri-
terion in learning did not change the con-
clusion that recall was not influenced by the
two variables which influenced learning.

Fig. 6. Acquisition as a function of stimulus-term
similarity and number of different letter orders of
the stimulus terms (Experiment 5).

The relearning scores following recall re- 
flected quite exactly the differences observed 
in original learning. Both number of letter 
orders and interstimulus similarity were 
statistically significant (F = 4.58 and 15.09, 
respectively), but the interaction was not 
(F = 2.35). 


Discussion 


Varying the number of different orders of 
the five letters of the stimulus terms from 


Figure 6 that most of the effect occurred in 
changes from one order of the letters to two; 
the change in learning rate associated with 
the increase from two to four letter orders 
was slight, either with high or with low 
similarity among the stimuli. Stimulus cod- 
ing in the present experiment required only 
that a stable and differentiating response be 
acquired to each stimulus term or a portion 
of it. It is clear that this process was re- 
tarded by high stimulus similarity and by 
varying the orders of the letters from trial 
to trial. Yet, once such stable responses were 
acquired, their integrity over time was con- 
stant regardless of the difficulty of their ac- 
quisition. The invariance of the recall scores 
indicates this.


EXPERIMENT 6 


In the final experiment we have attempted 
to determine if sound coding occurs in PAL. 
Let a paired associate be represented by 


whether or not the response term B may be
stored as a sound unit despite the fact that


as a sound-coded unit, and second, some de-
coding rule must be acquired whereby S can


Let the corresponding pair in the second list 
be A-FOE. The paradigm is ostensibly A-B,


A and the sound. On this basis alone, very 
heavy positive transfer would be expected.
However, a negative component may be
present in the decoding process, since the
identical sound unit in the two cases must be
decoded differently in order to reproduce the
correct three letters. Nevertheless, if the
overall transfer effects were positive (as
compared with the appropriate control list
where possibilities for transfer of sound cod-
ing are minimized) we would conclude that
sound coding had occurred.


Method 


identically by all Ss, and further contained tri- 
grams which had high formal similarity to the 
former ones but would be pronounced differently. 
The printed list of 72 trigrams was given to 116 


(gave the same word). These varied from 47% 
(47% of the Ss gave the same word) to 95%, when 
the following three criteria were imposed as added 


ulus number, and so on. 
TABLE 2
RESPONSE TERMS USED IN EXPERIMENT 6 TO
STUDY SOUND CODING BY A TRANSFER DESIGN

      First list          Second list
  E list     C list       All groups
   HUR        HOR            HER
   SEY        SIY            SAY
   FOH        FOM            FOE
   CRI        CRO            CRY
   KIQ        KIJ            KICK
   PSU        MSU            SUE
   AKE        AFE            ACHE
   EDJ        EDT            EDGE

For each item in the E list a control trigram was
chosen to form a control list (C list). Each C item




of the E item with the word, but its pronunciation 
was distinctly different from the word. Thus HOR 
is the C item for HER, and both were assigned the 
same number as the stimulus term. If sound coding 
of the response terms occurs as S learns the first 
list, the positive transfer from this source should 
be greater for the E list than for the C list. 

The E and C lists, as given in Table 2, are ho- 
mogeneous or unmixed lists in terms of their rela- 
tionship with the second list. That is, each item of 
the E list had the same pronunciation as its cor- 


the items in the C list had the same pronunciation 
as the words in the transfer list. As a potential aid
in interpreting any differences in transfer between 
the E and C lists, it seemed advisable to test also 
for transfer due to sound coding through the use 
of mixed lists. Therefore, two mixed lists were con- 


the first four 
items of the E list (as given in Table 2) plus the 


total of four conditions, each representing a differ-
ent first list (E, C, M-1, M-2), and all having the
same list on the transfer test. One additional char- 
acteristic of the lists should be noted. The stimulus 
numbers from 1, 2,...... 9, were paired in order 


with the response items for which pronunciation is 
appropriate for transfer, and the larger numbers 
were associated with response terms for which pro- 
nunciation is not appropriate. 
situation was reversed. Thus, th 


of the four conditions. Four schedule sheets were
constructed with 24 entries on each such that each 


condition occurred six times, with the order being 
random and different for each sheet. The Ss were 
assigned to the schedule sheets in order of their 


The first-list learning was carried until S cor-
rectly anticipated all items on a single trial. The
second list (transfer list) was presented for one
study and 10 anticipation trials. For both lists the
rate of presentation was 2:2 seconds, with an inter-
trial interval of 4 seconds. The response terms in


Results 


First-List Learning. The mean number of
trials to learn the first list to a criterion of
one perfect recitation varied from 10.96 to 14.50.
The F was 1.60. Thus, the combined effects 
of different lists and different groups of Ss 
did not appreciably influence learning of the 
first list. 

Transfer Effects. The acquisition curves 
for the four lists across the 10 transfer trials 
are shown in Figure 7. It is to be noted that 
performance on the E list was superior to 
that for the C list, while performance on the 
two mixed lists (M-1 and M-2) was some- 
what inferior to that on the C list. As an 
early measure of transfer the mean correct 
on the first two trials was determined for 
each S. The F was 10.08; with 3 and 92 df,
an F of 2.71 is significant at the .05 level,
4.86 at the .01 level. However, it is apparent 
from Figure 7 that the major source of var- 
iance resulted from the superiority of learn- 
ing the E list to the learning of the other 
three lists. Using the within-groups variance 
for determining the error term, the t for the
mean difference (1.95 items) between List 
E and List C was 2.19. Thus, the results for 
the two homogeneous lists indicated a source 
of positive transfer for List E that is greater 
in magnitude than that for List C. We as-
sume this stems from the transfer of a
sound-coded response for List E and not for 
List C. 

The F for mean total correct across 10 
trials was also significant statistically (F = 
3.66) but the difference between List E and 
List C was not (t= 1.59). As is clear in 
Figure 7, the difference in performance on 
the E and C lists essentially disappeared 
after six acquisition trials.


The mixed lists in Figure 7 did not appear
to be "behaving" simply as if they were con-
stituted of half C items and half E items. If
positive effects were produced by the four
E items and neutral effects by the C items,
the performance curves for the mixed lists
should fall between those for the E and C
lists. This was obviously not the case.
Rather, it appears that some factor pro-
duced a small amount of interference. To
examine this situation more carefully, we
need to look at the performance on the two
subsets of items within each mixed list, one
subset being sound codeable, the other not.
We may, for comparative purposes, use the
performance on these same two subsets from
the E and C lists.

One of the subsets in the transfer list con- 
sisted of the pairs with the response terms 
HER, SAY, FOE, and CRY. The other consisted
of KICK, SUE, ACHE, and EDGE. We have de-
termined the mean total correct responses
across the 10 transfer trials for each sub- 
set. Differences in performance must be at- 
tributed to differences in the transfer from 
the first lists. The means are plotted in Fig- 
ure 8. The transfer differences, in turn, are 
assumed to result from differences in sound 
coding of the response terms.

Fig. 7. The effects of sound coding on 10 transfer trials (Experiment 6).


Looking first at the performance on the 
two subsets for the mixed lists, it can be 
seen that there was little difference. For 
List M-1, mean correct on the two subsets 
was essentially identical; the second sub-
set, which was sound codeable, was not su-
perior to the first subset, which was not
sound codeable. For List M-2, the sound
codeable subset (first subset) was somewhat
ahead of the noncodeable subset, but the
difference was not significant statistically
(t = 1.73). Furthermore, the performance
on the codeable subset for List M-2 was not 
better than for the same items in the C list. 
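
The composition of the four first lists implied by these subset comparisons can be laid out explicitly. The sketch below is our reconstruction rather than the authors' specification. It assumes that List M-1 took its first subset from the C list and its second from the E list, that List M-2 was the reverse, and that EDJ (not EDT) is the E item for EDGE; the helper `first_list` is hypothetical.

```python
# Transfer-list response terms, split into the two subsets named in the text.
SUBSET_1 = ["HER", "SAY", "FOE", "CRY"]
SUBSET_2 = ["KICK", "SUE", "ACHE", "EDGE"]

# First-list trigrams from Table 2, keyed by the transfer word.
E_ITEMS = {"HER": "HUR", "SAY": "SEY", "FOE": "FOH", "CRY": "CRI",
           "KICK": "KIQ", "SUE": "PSU", "ACHE": "AKE", "EDGE": "EDJ"}
C_ITEMS = {"HER": "HOR", "SAY": "SIY", "FOE": "FOM", "CRY": "CRO",
           "KICK": "KIJ", "SUE": "MSU", "ACHE": "AFE", "EDGE": "EDT"}


def first_list(codeable_subsets):
    """Build a first list: E items for sound-codeable subsets, C items otherwise."""
    items = []
    for number, subset in ((1, SUBSET_1), (2, SUBSET_2)):
        source = E_ITEMS if number in codeable_subsets else C_ITEMS
        items.extend(source[word] for word in subset)
    return items


FIRST_LISTS = {
    "E": first_list({1, 2}),    # all items sound codeable
    "C": first_list(set()),     # no items sound codeable
    "M-1": first_list({2}),     # second subset codeable (assumed from the text)
    "M-2": first_list({1}),     # first subset codeable (assumed from the text)
}

for name, items in FIRST_LISTS.items():
    print(name, items)
```

Laid out this way, each subset comparison in the text pairs a mixed list with the unmixed list that shares its first-list items on that subset.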

Performance on the mixed lists was gen- 
erally inferior to that on the unmixed lists. 
For the first subset of items the difference 
between mixed and unmixed gave an F of 
3.92; for the second subset, the F was 9.26. 
An F of 3.94 is required for the .05 signifi- 
cance level (1 and 92 df). The difference 
between the mixed and unmixed lists on the 
second subset suggests that there was inter-
ference in the mixed lists, for performance
on this subset for both mixed lists was infe- 
rior to the performance on these items in 
the C list. Such an argument cannot be made 
for the first subset, however, where the dif-
ference appears to represent a facilitation
in learning the items in the E list.

Fig. 8. The effects of sound coding on 10 transfer
trials for subsets of items (Experiment 6).

Finally, it may be noted that Figure 8 
shows that the facilitation of the E list over 
the C list was largely confined to the first 
subset. On the second subset there was only 
a small difference. We have been unable to 
determine a reason for this difference. An 
obvious difference between the two subsets 
is that the second had three four-letter 
words as response terms, but it is not ap- 
parent to us why the transfer of a sound- 
coded response should be any less because of 
this. The two subsets were almost identical
in terms of the pronounceability scaling
which had been done on the trigrams prior 
to the experiment. 

Overt Errors. Error analyses have not 
provided information of additional ana- 
lytical value. Mean errors per opportunity 
across the 10 transfer trials did not differ 
significantly for the four lists (the F was 
less than 1). It was noted in introducing the 
present experiment that if S sound codes 
the response terms in learning the first list 
and if this transfers to the E list, the asso- 
ciation between the stimulus term and the 
sound-coded response term remains intact 
from the first list to the second list. It would 
follow, therefore, that if transfer occurs by 
this route, the Ss learning the E list should
have fewer misplaced responses in acquiring
the transfer list than those learning the C 
list. The mean errors of this type were 2.04 
for the E list and 2.58 for the C list, but the 
F was less than 1. 


Discussion 


The data indicate a small net positive
transfer effect in learning the E list that was 
not present in learning the C list. We have 
interpreted this effect as being due to the 
fact that in learning the first list S sound
codes at least some of the response terms. 
Therefore, when the response term in the 
transfer list paired with a given stimulus 
term had the same pronunciation as the re- 
sponse term in the first list paired with the 
same stimulus term, the first-list association 
was appropriate for the transfer list. There 
is at least one other alternative interpreta- 
tion which might be offered to account for 
the positive transfer. The process producing 
the transfer might be said to occur only at 
the time of the transfer test. Assume that S
did not sound code the first-list response 
terms in learning the first list. When the 
transfer list was presented, S may have 
discovered in pronouncing the words that 
the sound was the same as the sound of the 
trigram in the first list if he now pronounced 
the trigram. Thus, on the transfer test, the 
sequence of association might run from the 
number stimulus to the spelled trigram to 
the pronounced trigram (for the first time) 
to spelling of the word. We have no evidence 
to contradict this interpretation. However, 
it is a complicated sequence and would seem
unlikely to occur within a 2-second antici-
pation interval, particularly on the early
transfer trials, and the evidence shows 
transfer was maximal on the early trials. We 
are inclined to hold to the position that S 
sound coded the first-list responses during 
first-list learning and that this provided the 
source of the positive transfer effect. That 
we found no auxiliary support for this posi-
tion in the error analyses may reflect only
the fact that the transfer effect is small and 
did not occur for all items. 

We found no evidence for positive trans- 
fer in the mixed lists; indeed, there was 
some evidence for a small interference effect.
It was noted earlier that the mixed lists
could be thought of as two-rule lists (some 
items sound codeable, some not), and like 
the two-rule lists of the earlier experiments, 
no positive effect occurred in spite of the
fact that the evidence indicated clearly that 
coding to words was occurring in these ear- 
lier experiments. The sound codeable items 
have been associated with a particu- 


noncodeable items with a higher (or lower) 
series. But if such conceptual associations 


we believe that
first-list sound coding occurred.

We conclude that sound coding of re- 
sponse terms has been demonstrated by the 
present conditions. Yet, as in all of our ex- 
periments, the positive effect of a particular 
coding system was relatively small. 


GENERAL CONCLUSIONS 


All experiments reported show that Е-е 
i rate of 


emerge which, in the long-term attempt to 
understand verbal learning, may be more 
important than the fact that coding systems 
can be shown to influence learning posi-
tively.


rules with encoded concepts in the two-rule 
list of Experiment 2. When stimulus en- 
coding was involved, it appeared that unless 
the system was very simple, S did not even 
perceive the possibilities for encoding to 
words.

2. The positive effect resulting from the 
use of simple coding schemes is not, in any
absolute sense, large. In this sense, the pres-


wood, 1959). 

3. The use of a coding system does not 
inevitably imply that learning will
thereby be facilitated. It became clear that
in Experiments 1, 2, and 3 Ss in certain of 
the conditions were encoding to words, but 


clear evidence that one group of Ss was en- 
coding to words and the other not, but learn- 


Two general comments must be made in 
conclusion. As pointed out in the introduc- 
tion, our conclusions must necessarily be 
limited to the coding system manipulated by 
quite possible that other coding 
systems were involved of which we are un-
aware. But we can say, to repeat one of the
above points, we know that S did in some 
instances attempt to utilize the built-in cod-

The second comment relates to the gen-
erality of our findings. We do not know the
number of different types of coding systems 
which S may naturally use when presented with
a list to learn. We therefore do not know
how representative the manipulated cod- 
ing systems are to all coding systems. How- 
ever, it would appear that the most power- 


ful coding systems must in some way change 
the meaningfulness of the units or reduce 
the deleterious effects of similarity, for these 
in turn are the two most powerful task vari-
ables influencing the rate of acquisition of
verbal tasks. The coding systems inserted in 
the lists used in the present experiments 
were relevant to these two variables. 


——
A QUANTITATIVE EXTENSION OF HEIDER'S THEORY OF
COGNITIVE BALANCE APPLIED TO INTERPERSONAL
PERCEPTION AND SELF-ESTEEM 1


WILLIAM M. WIEST
Reed College


Theories of cognitive balance (with the [...] & Tannenbaum's congruity model)
assume only [...] of relations among elements. To eliminate this restriction,
a quantitative extension of Heider's theory is presented that enables
systematic treatment [...] in which all relations are continua.
Sociometric-like ratings [...] children (Grades 5, 6, and 7) support the
quantitative model's prediction that (a) the magnitude and sign of
correlation between S's feeling toward a set of acquaintances, and his
perception of a focal person's evaluation of these same persons, is a linear
[...] attraction to the focal other (p < [...]) and (b) the extent to which S
believes his evaluations of others are reciprocated by them is a positive
function of his self-esteem (p < .0005).


Recent theoretical formulations by the-
orists such as Abelson and Rosenberg
(1958), Cartwright and Harary (1956),
Festinger (1957), Heider (1946, 1958), Mc-
Guire (1960a, 1960b), Newcomb (1953,
1961), and Osgood and Tannenbaum (1955)
focus attention upon the relative congruity
or consistency among a person's cognitions.
While these formulations differ from one
another both in rigor and in the types of
situations to which they typically are ap-
plied, they agree in emphasizing the inter-
dependence among elements of a cognitive
structure. Each theory defines certain sets
of relations among cognitive elements as
balanced, congruous, consistent, or conso-
nant, and each postulates that states of
imbalance tend to become resolved into
balanced states. Data from a wide variety
of settings attest to the predictive power
of these theories of cognitive consistency
(Brehm & Cohen, 1962; Burdick &
Burnes, 1958; Cohen, 1960; Horowitz,
Lyons, & Perlmutter, 1951; Jordan, 1958;
Kogan & Tagiuri, 1958; Morrissette, 1958;
Newcomb, 1953, 1961; Osgood, 1960; Ro-
senberg, Hovland, McGuire, Abelson, &
Brehm, 1960; Runkel, 1956; Sampson &
Insko, 1964).

One limitation shared by most theories
of cognitive balance is that they are re-
stricted to the special case in which the
relations among cognitive elements are
measured dichotomously. While Abelson
and Rosenberg's (1958) model and Cart-
wright and Harary's (1956) generalization
of Heider's theory make it possible to de-
fine the degree of balance of structures
with more than three elements, these con-
tributions leave unsolved the problem of
how to deal with relationships of different
strength among elements. Osgood and Tan-
nenbaum (1955), treating some relations
among elements quantitatively, have made
a significant initial step toward adequate
handling of this issue. Some features of
their model of attitudinal congruity can be
treated as special cases of the quantitative
extension of Heider's theory proposed here.

1 This paper is adapted from a [...] submitted to the University of [...]
in partial fulfillment of the requirements for the degree of Doctor of
Philosophy. Abbreviated [...] Clausen and Edward E. Sampson also offered
valuable assistance. The personnel and children [...] their role in
making [...]




Because current theories of cognitive 
balance lack a clear rationale for defining 
the degree of balance in structures whose 
elements have variable amounts of sim- 
ilarity or attraction (or other relation) to 
one another, they can not reflect accurately 
the degree of balance found in empirically 
measured cognitions. Further, as is true 
of all nonquantitative formulations, these 
theories place severe restriction on the ex- 
tent to which interaction between theory 
and data can aid in the development of 
more powerful scales of measurement. 

To obviate these limitations of nonquan- 
titative theory, a more general model of 
cognitive balance is here proposed. The 
model is designed to explicate a balance 
theory for the general case in which the 
relations among elements in the cognitive 
structures are continua rather than dichot- 
omies. The present formulation focuses pri- 
marily upon structures having no more 
than three relations among the elements. 
However, preliminary theoretical and em- 
pirical attempts by the author to generalize 
the model to a larger number of relations 
(dimensions) appear promising.

The model is referred to as an extension 
of Heider’s theory because it derives pri- 
marily from Heider's work and adopts his 
terminology and because of Heider's ap-
parent precedence in the explicit statement
of balance theory. Nevertheless, the model 
is closely related to other theories of cog- 
nitive consistency. 

Several features of Heider's (1958) the- 
ory relevant to the present formulation 
may be noted. The theory typically deals 
with cognitive structures having two or 
three elements—p, a person or perceiver; 
o, some other person; and x, an impersonal
object or thing. While various kinds of re-
lations among elements can be defined,
Heider distinguishes two classes: senti-
ment, or liking relations, L, and similarity,
or unit relations, U. Each of the relations
may have either a positive or a negative
sign. For example, p can either like or dis-
like o, p can think that o is or is not re-
sponsible for x, etc.

While Heider treats L and U relations
as functionally equivalent, Cartwright and
Harary (1956) have noted that ~L means
"dislikes," the opposite of "likes," while
~U generally is understood to mean "not
related or similar to," the complement of
"related to." The negation of the U relation
implies the absence of a relation and there-
fore might better be represented by zero
than by a negative sign. The fact that
negative U relations result in "vacuously"
balanced structures (Cartwright & Harary,
1956) is explicable in terms of the theory
proposed here.

The remaining discussion focuses pri-
marily on the more general affective rela-
tion, L. Thus, pLo represents the variable,
p's degree of attraction to o, pLx is p's
evaluation of x, and oLx is p's perception
of o's feeling toward x. With relations be-
tween pairs of elements measured dichot-
omously, the eight p-o-x configurations
shown in Figure 1 can be generated. Those
structures with all positive or two negative
signs are defined as balanced, and those
with one or three negative signs as im-
balanced. While Heider (1958, pp. 203,
206) feels there is some ambiguity in the
case of three negative relations, the ra-
tionale provided by graph theory (Cart-
wright & Harary, 1956) justifies treating
this case as imbalanced.

Fig. 1. Balanced and imbalanced structures for the case of three elements
related positively or negatively. (The signs between p and o and between p
and x represent p's evaluation of (or more generally, p's relation to) o and
x, respectively, and the sign between o and x represents p's perception of
the relation between o and x. The three specific liking relations are
referred to as pLo, pLx, and oLx.)

Because the theory implies that imbal-
anced states generate forces toward resto-
ration of balance, fewer imbalanced than
balanced structures should be observed 
empirically. This hypothesis is strongly 
supported by Kogan and Tagiuri’s (1958) 
study of the perception of interpersonal 
preferences among members of small 
groups, a study in which another person, 
q, substituted for x in the p-o-x triad.
Studying two balanced structures (Cases I 
and IV in Figure 1) and two imbalanced 
ones (Cases V and VI), these investigators 
found that balanced cognitive structures 
occur far in excess of chance and also in 
excess of the degree of balance of the 
actual preference network. Imbalanced 
structures, on the other hand, occur sig- 
nificantly less often than would be expected 
by chance. 
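
The dichotomous rule stated earlier (all positive or two negative signs balanced; one or three negative signs imbalanced) is equivalent to asking whether the product of the three signs is positive. The following is a minimal sketch of that equivalence; the coding of relations as +1 and -1 and the function name are our own.

```python
from itertools import product


def is_balanced(pLo, pLx, oLx):
    """Dichotomous rule: balanced when the product of the three signs is positive."""
    return pLo * pLx * oLx > 0


# Enumerate the eight p-o-x configurations, each relation coded +1 or -1.
configurations = list(product((+1, -1), repeat=3))
balanced = [c for c in configurations if is_balanced(*c)]
imbalanced = [c for c in configurations if not is_balanced(*c)]

print("balanced:  ", balanced)
print("imbalanced:", imbalanced)
```

The enumeration yields four balanced and four imbalanced structures, with the all-negative triad falling among the imbalanced ones, as the graph-theoretic rationale requires.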


THE QUANTITATIVE BALANCE MODEL 


In the quantitative version of Heider's 
theory each relation, pLo, pLx, and oLx, 
is considered a bipolar dimension concep-
tually independent of the other two. Each
dimension can be identified with a quanti-
tative scale measuring the degree of (felt
or perceived) attraction or antipathy be-
tween pairs of elements. These three rela-
tions, taken as variables, are viewed as
defining a three-dimensional space in which
each possible point represents a configura-
tion of values on the three relations (see
Figure 2). 

Heider's four balanced structures can be
seen on the left side of Figure 2, located
at the four corners of the cube, I, II, III,
and IV. For example, the corner point
labeled “I” represents that configuration
in which p is highly attracted to both o
and x and in which o is also perceived to
have a high degree of liking for x.

All points in the space that fall along a
straight line joining any of Heider's four
balanced structures are defined as balanced.
The six possible straight lines that
can be defined in this manner yield a
tetrahedron that lies within the cube. The
exact form and position of the tetrahedron
is easily seen by examining the right hand
side of Figure 2, where the tetrahedron is
portrayed by itself. It is assumed that all
balanced configurations (points) are on
the surface of or within the tetrahedron.
However, the closer a point is to the center 
of the tetrahedron, the more does the char- 
acterization, vacuously balanced, apply; 
this is because approaching the center of 
the tetrahedron is equivalent to approach- 
ing a zero value on all the variables (i.e., 
it is equivalent to approaching the absence 
of any attraction, repulsion, or similarity 
relation among the elements in the struc- 
ture). Points outside the tetrahedron are 
imbalanced, and they are more imbalanced 
the greater their distance from the nearest 
side of the tetrahedron. Thus, the four 
corners of the cube maximally distant from 
the nearest surface of the tetrahedron are 
coordinated to the four imbalanced struc- 
tures of Figure 1. 
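
The geometry described above can be sketched in a few lines of Python. This is an illustration only, not part of the original model statement: it assumes that each relation is rescaled to the range -1 to +1 and that the four balanced corners are those whose three signs multiply to a positive value. The function name `imbalance` and the argument names are hypothetical.

```python
import math

# Assumption: each relation is scaled to [-1, +1], and the balanced corners
# are the four cube corners whose three signs have a positive product.
BALANCED_CORNERS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def imbalance(p_lo, p_lx, o_lx):
    """Distance from a point in the cube to the tetrahedron (0.0 if inside)."""
    worst = 0.0
    for cx, cy, cz in BALANCED_CORNERS:
        # The face opposite corner c is the plane c . v = -1; a point v is on
        # the inner side of that face when c . v >= -1.
        violation = -1 - (cx * p_lo + cy * p_lx + cz * o_lx)
        worst = max(worst, violation)
    # Each face normal has length sqrt(3), so divide to get a distance.
    return worst / math.sqrt(3)


# Balanced corner, center of the cube, and two imbalanced corners.
print(imbalance(1, 1, 1))
print(imbalance(0, 0, 0))
print(imbalance(-1, -1, -1))
print(imbalance(1, 1, -1))
```

Within the cube a point can lie beyond at most one face of the tetrahedron, so the largest single violation is enough to give the distance to the nearest side.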

Of course, it is of interest to specify those 
points in the space which are balanced 
(or balanced to varying degrees) because 
of the hypothesis that balanced states are
relatively stable equilibria toward which 
imbalanced configurations tend to move. 
Thus, to postulate that a particular figure 
defines the limits within which all balanced 
structures lie is to postulate that empiri- 
cally measured configurations will tend to 
be contained within these boundaries. In 
short, balance theory in this context pre- 
dicts certain features of the distribution of 
points in the space. 

One method for translating these implications
of the model into empirical operations
yields a set of predictions about
the magnitude of covariation (or association)
between any two variables as a function
of the third variable. Such an interpretation
is illustrated in Figure 3, which
shows the cube of Figure 2 “sliced” in five
places along the dimension, pLx. The same
“picture” would emerge if the cube were cut
in either of the other two dimensions.

Fig. 3. Hypothetical boundaries of balanced
points at five different levels of pLx, obtained by
“slicing” the cube of Figure 2 at five places along
the dimension, pLx.

If empirically measured configurations
tend to lie within the boundaries hypothesized
for the extended balance theory, then
the direction and degree of correlation between
pLo and oLx should be directly related
to the value of pLx. It is apparent that the
model in this form can be tested by calculating
the degree of correlation* between
any two variables with the third held
constant at a number of different levels.
Thus, the predictions of the model can be
stated formally as a set of monotonic functional
relations, of which only Case 3 is
shown graphically in Figure 3:


1. $r_{pLo,\,pLx} = f(oLx)$,
2. $r_{pLx,\,oLx} = f(pLo)$,
3. $r_{pLo,\,oLx} = f(pLx)$.


*An average difference score, D, or some
variety of difference score (cf. Osgood, Suci, &
Tannenbaum, 1957), providing a measure of the
correspondence between, for example, the pLo
values and the oLx values at different levels of
pLx, might appear to be as good a dependent
variable as the correlation coefficient for purposes
of testing the model. However, the D index is less
adequate than a correlation coefficient in a number
of respects. First, unlike a correlation coefficient,
the numerical value of D is not independent of
the unit of measurement. Of more importance is
the fact that a D index may be low (indicating
a high degree of correspondence between pLo and
oLx) under conditions in which a correlation coefficient
would indicate no relation between the
two variables. This could happen, for example,
if all pLo and oLx data points at a given level
of pLx fall in the same (or very small) area, implying
no (or little) variance on either dimension.
While this state of affairs might be interpreted
as indicating a high degree of cognitive balance,
it is more parsimoniously viewed as a manifestation
of a simple response set. Such an effect, that
is, may or may not be determined by tendencies
toward cognitive balance. In contrast, a correlation
(literally understood as covariation) cannot
be explained on the basis of simple response
stereotypy and is therefore better evidence for
the operation of a higher order set implied by the
term “cognitive balance.”


In verbal translation the notation implies 
that the extent to which the intensity of 
any two relations covaries is a function of 
the third relation. For example (Case 3), 
the more intensely p feels toward x, the 
greater will be the covariation between his 
feelings toward o and his perception of o's 
feelings toward x. 
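
As a rough illustration of Case 3, the sketch below groups ratings by the level of pLx and computes the Pearson correlation between pLo and oLx within each level. The data are invented for the example, and the helper names `pearson_r` and `r_by_level` are hypothetical; nothing here reproduces the procedure or the data of the studies reported later.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_by_level(triples):
    """Correlate pLo with oLx separately at each level of pLx.

    Each triple is (pLo, oLx, pLx).
    """
    groups = {}
    for p_lo, o_lx, p_lx in triples:
        xs, ys = groups.setdefault(p_lx, ([], []))
        xs.append(p_lo)
        ys.append(o_lx)
    return {level: pearson_r(xs, ys) for level, (xs, ys) in groups.items()}


# Hypothetical ratings: when p likes x (pLx = +1), oLx follows pLo;
# when p dislikes x (pLx = -1), oLx runs opposite to pLo.
data = [(-1, -1, 1), (0, 0, 1), (1, 1, 1),
        (-1, 1, -1), (0, 0, -1), (1, -1, -1)]
result = r_by_level(data)
print(result)
```

With these made-up ratings the correlation is positive at the high level of pLx and negative at the low level, which is the direction of change that the monotonic relation in Case 3 describes.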

The predictions can be stated more rigorously,
if the measuring instruments warrant,
as a set of corresponding linear equations:


1. $r_{pLo,\,pLx} = k + w(oLx)$,
2. $r_{pLx,\,oLx} = k + w(pLo)$,
3. $r_{pLo,\,oLx} = k + w(pLx)$,


where k = some constant (k = 0 if the
implications of the model are taken literally)
and w = a weight whose value depends
upon the scale units of the variable
being weighted (e.g., w = 1 if +1 represents
the maximum amount of liking and
-1 represents maximum disliking). Stated
verbally and with some loss in precision, 
several implications of Equation 3, for 
example, are: (a) The more p likes x, the 
greater will be the degree of positive cor- 
relation between his liking for o and his 
perception of o's liking for x. (b) The more
p dislikes x, the greater will be the degree
of negative correlation between his liking
for o and his perception of o's liking for x.
(c) If p feels indifferent toward (neither
likes nor dislikes) x, there will be no relation
between his liking for o and his perception
of o's liking for x.
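
The three implications can be checked against Equation 3 with a very small sketch. It assumes the literal reading of the model (k = 0) and a scale running from -1 to +1 (so w = 1); the function name `predicted_r` is hypothetical.

```python
def predicted_r(p_lx, k=0.0, w=1.0):
    """Equation 3: predicted correlation between pLo and oLx at a given pLx.

    Defaults assume the literal model (k = 0) and a -1..+1 scale (w = 1).
    """
    return k + w * p_lx


# (a) strong liking, (c) indifference, (b) strong disliking
for level in (1.0, 0.0, -1.0):
    print(level, predicted_r(level))
```

Under these assumptions the predicted correlation simply equals the value of pLx, so it is positive when p likes x, zero at indifference, and negative when p dislikes x.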

The more rigorous version of the model
(as a set of linear equations) can be tested
adequately only if rather strong assumptions
are made about the properties of the
measuring instruments. For example, testing
the prediction of the theory regarding
the elevation of the curve (value of k) in
Equation 1 requires at least that the oLx
scale have a true zero. Likewise, testing
the theory with respect to the slope of the
curve (value of w) in Equation 1 requires
that the oLx scale have equal intervals and
that its end points define maximum degrees
of liking and disliking. Whether these assumptions
are met by scales currently
available is discussed more fully in an ensuing
section.

Another way of testing the fit of the
model to empirical data does not depend
upon determining degrees of correlation. A
more detailed statement of assumptions regarding
the properties of the “tetrahedron
model” would make it possible to construct
a procedure for indexing the degree of
balance of each possible point in the three-dimensional
space. Such an index of the
degree of balance of a structure could be
given a relative frequency (likelihood) interpretation
as suggested by Cartwright
and Harary (1956), an interpretation in
accordance with the assumption that imbalanced
points are avoided and thus occur
less frequently than balanced ones. If
such a formula were constructed, and a set
of operations carried out to associate a
number with each possible point (or with
each of a finite set of subspaces in the
cube), the model could be tested by noting
the amount of correspondence between theoretical
and observed frequencies in the
three-dimensional subspaces. It should be
noted, however, that given the model alone,
there does not exist a procedure for determining
the theoretical frequencies in the
subspaces. The model itself specifies nothing
about the frequency distributions of
the three constituent dimensions, pLo, pLx,
and oLx, considered separately. That is,
the model does not predict how much p
likes o unless the relations between p and
x and between o and x are given. Therefore,
the theoretical frequencies are contingent
upon the shape of the joint (bivariate)
distributions of at least two of the variables.
Given one of the three possible bivariate
distributions, a frequency “expected”
on the basis of the quantitative
balance model could be assigned to each
subspace in the cube.

This particular method of testing the 
model would provide the same information 
about the predictive utility of the model 
as that given by the earlier presented 
method based on covariation of any two 
variables at different levels of the third. 
The latter method based on covariation 
appears to be more easily carried out in 
practice and also embodies the possibility 
of stating the model as a set of linear 
equations (or monotonic functions), modes 
of representation that provide a readily 
understandable description of the fit of the 
data to the model. 

Some aspects of the model just described 
may be compared with the theory of atti- 
tudinal congruity proposed by Osgood and 
Tannenbaum (1955). These theorists have
introduced quantification into their model; 
however, they have quantified only two 
of the three relations in the three-element 
configuration. In Osgood's (1960) system 
of representing a simple cognitive structure,
the signs attached to o and x represent a
person's attitude toward the person or object,
o and x, as measured on
the evaluative scales of the Semantic Differential
(Osgood, Suci, & Tannenbaum, 1957).
These evaluative measurements can vary
from some negative number to some positive
number. The sign of the line between
o and x indicates whether the two elements
are perceived to be closely related to one
another or in opposition (or in the special
case they discuss, whether a source, o, is
believed to favor or to oppose an idea or
object, x). The latter relation itself, however,
is treated as a simple dichotomy; it
is either positive (associative) or negative
(dissociative). On this basis Osgood and
Tannenbaum propose that attitudes toward 
objects are congruent (a) if they are 
equally polarized in the same evaluative 
direction when the objects are related by 
positive assertions or (b) if they are equally 
polarized in opposite evaluative directions 
when they are perceived to be in opposition. 
How would these definitions of congruity
be affected by varying the strength of the
relation between o and x? A plausible
answer is that decreasing the strength either
of a positive or negative assertion relating
o and x could only make the definition of
congruity less certain or more indeterminate.
Such a conclusion is suggested by the
tetrahedron model. Perhaps Osgood and
Tannenbaum have deliberately chosen to
restrict consideration to the special case
where at least one relation is of maximum
strength to avoid dealing with cases in
which predictions are less determinate.

A suggestion from the work of Osgood 
and Tannenbaum may also be relevant to 
the tetrahedron model. Within the context 
of Osgood's (1952) theory of meaning, it 
is concluded that there is a tendency for 
attitudes toward (evaluations of) objects 
to become polarized: That is, there is a 
continuing pressure toward polarization 
since extreme, “all or nothing,” “black and 
white” judgments are simpler than more 
refined ones. Applying this reasoning to the 
tetrahedron model, one might hypothesize 
forces directed away from the center of the 
cube and toward its sides. This hypothesis 
implies that attitudes or cognitions that are 
unrelated to one another (forming vacu- 
ously balanced structures) should be rela- 
tively rare, a phenomenon also implied by 
Harary’s (1959) postulated “tendency to- 
ward completeness.” While forces directed 
away from the center of the cube would 
tend to make cognitive elements relevant 
to one another, it is the tendency toward 
cognitive balance that determines the form 
of the relations among these elements. In 
short, two classes of forces might be postulated,
one tending to keep points away from
the center of the cube (Osgood & Tannen- 
baum’s pressure toward polarization) and 
the other (tendencies toward balance) 
tending to limit the distribution of points 
to certain (balanced) positions along the 
outer surface of the cube. 


APPLICATION OF THE MODEL TO
TWO BEHAVIOR DOMAINS


While the quantitative version of Heider's
balance theory is applicable to a wide
variety of social psychological problems,
the empirical studies reported here constitute
essential first steps in providing direct
and easily interpretable tests of the model.

The quantitative balance model was 
tested in two related behavior domains: 
(a) perceived interpersonal attraction 
among variously evaluated peers (Study 
I) and (b) perceived reciprocation of liking
by others toward the self under various 
levels of self-esteem (Study II). 

Study I tested the model in the context 
of p-o-q configurations (where another 
person, q, is substituted for x in Heider’s
p-o-x triad). The study has features in 
common with previous research (e.g., 
Kogan & Tagiuri, 1958; Tagiuri, 1958) on 
the perception of interpersonal preference, 
with the addition of variable strengths of 
L relations. The specific hypothesis tested 
is that there is a gradient in the degree of 
correlation between a person’s own evalu- 
ative ratings of a set of others and the 
liking he perceives each of these others to 
feel toward (or to “receive from”) various 
focal persons—the gradient being a func- 
tion of the person’s attraction to the focal 
person. The hypothesis can be stated as a
simple monotonic function, $r_{pLo,\,oLq} = f(pLq)$,
or as a linear equation, $r_{pLo,\,oLq} = k + w(pLq)$,
where k and w are defined
as for Equations 1, 2, and 3 in a preceding 
section. In other words, the extent to which 
a person’s (p’s) liking for a set of others 
(o's) is correlated with the amount of 
liking he perceives these others (o's) to feel 
toward some focal person (q), depends 
upon how much he (p) likes the focal per- 
son (q). 

Study II tested the model for p-o-s configurations
where s, the perceiver's self, is
considered an element in the cognitive
structure. Thus, Study II examined Heider's
(1958, p. 210) assertion that his
definitions of balanced and imbalanced
structures require the assumption that p
likes s (himself); that is, that the perceiver
has high self-esteem. Research by
Broxton (1963), Deutsch and Solomon
(1959), Lundy (1956), Pilisuk (1962), and
Secord, Backman, and Eachus (1964) lends
support to Heider's suggestion that a person's
attitude toward himself is highly relevant
in the context of balance theory.

In terms of Heider's theory one can describe
four balanced and four imbalanced
p-o-s triads, analogous to the eight structures
listed in Figure 1. The structure
corresponding to Case I in Figure 1 is interpreted
as follows: If p has high self-esteem,
he should like those whom he sees
as liking himself. Case III means that if p likes
himself he tends to dislike o if he thinks
he is disliked by o. If p genuinely dislikes
himself he should like those whom he sees
as disliking him (Case II) and dislike those
he thinks like him (Case IV).

Whether these generalizations are valid, 
of course, can be determined only if there 
are persons who truly meet the condition, 
p dislikes s. It is likely, however, that currently
measured low self-esteem implies a
great deal of ambivalence about the self
rather than genuine dislike. Also, the specific
relation, p dislikes s, is rare or at least
difficult to conceive, in part because this
relation is imbalanced. A structure with the
relation, p dislikes s, is always imbalanced
if it is assumed that the unit (belonging)
relation between p and s is positive.

However, a test of the general hypothe- 
sized relationship between p’s degree of 
liking for s and the correlation (correspond- 
ence) between pLo and oLs—a relation 
derived from the quantitative extension of 
Heider’s theory—does not require persons 
who genuinely dislike themselves; the only 
requirement is to be able to identify reli- 
ably different degrees of pLs. 

In spite of the fact that no fully satis- 
factory conceptualizations of self-esteem 
are available, nor completely adequate 
measuring instruments (Wylie, 1961), 
Study II used a self-esteem scale to measure
the dimension, pLs. Although it is not
feasible to locate an indifference point on 
a scale of self-esteem nor even to expect 
that persons scoring low in self-esteem 
genuinely dislike themselves, it is reason- 
able to assume that low scorers have less 
net positive feelings about themselves than 
do high scorers. Thus, the specific hypothesis
of Study II is that the degree of correlation
between a person's liking for various
others and his perception of how
much they like him varies positively with
the person's level of self-esteem. This prediction
can be represented as the monotonic
function, $r_{pLo,\,oLs} = f(pLs)$, or as the
linear equation, $r_{pLo,\,oLs} = k + w(pLs)$,
where k and w are defined as in an earlier
section.*


Method 


Both Study I and Study II were conducted 
with subjects (Ss)* in elementary and junior high

*Several general statements identify issues 
that intentionally are not dealt with in these 
studies. First, while it is possible to examine the 
actual network of choices (liking relations) among 
individuals to discover the degree of balance of 
such social structures (Cartwright & Harary,
1956; Newcomb, 1953, 1961), this is clearly a different
problem from that of ascertaining the degree
of balance of these structures as perceived by
individuals. Study I and Study II concentrate on
the perceived or cognized relations among persons.
Second, it is possible to view the degree of similarity
between perceived and actual social structures
as defining the “accuracy of social perception.”
But as Tagiuri, Blake, and Bruner (1953)
and Cronbach (1955, 1958) have noted, the interpretation
of such derived measures of accuracy
is fraught with great difficulty. In short, the two
issues, degree of balance of the actual social system
and veridicality in social perception, are not
treated further here. Third, neither study attempts
to explicate causal relations among the
with problems of cause and effect in the data is 
not a matter of inadvertence. Balance theory 
implies only that cognitive structures are stable 
systems in which a change in any part may pro- 
duce changes in other parts. A test of the theory 


ticular causal sequence is typical. For example, 
balance theory is noncommittal on the question 


give rise to perceived agreement. Experimental 
studies (Backman & Secord, 1959; also see Rosen- 
berg, 1960, in a more general context) support the 


studies examine the degree of balance in naturally 
occurring, nonmanipulated cognitive structures, 
and therefore have no bearing on the priority of 
liking versus perceived agreement. Finally, the 
current studies bypass questions of the causal 
origin of the tendency toward balance. The author 
believes, however, that it may be profitable to 
examine the importance of such antecedent varia- 
bles as social reinforcement, history, variety of 
experience, and the current contingencies of rein- 
school, the data consisting of self-esteem scores
and sociometric-like ratings of class members of 
the same sex. Each S was asked to indicate his 
degree of attraction for every other person in his 
group, his perception of the feelings of these 
persons toward himself, and his perception of the 
liking relations among selected other classmates. 
The latter ratings were obtained via two modes, 
each S responding under one mode only. Mode 1 
required judgments about the attitudes of three 
focal persons (someone p likes, feels indifferent
toward, and dislikes) toward all of the other classmates.
Mode 2 reversed the task, requiring judgments
of how everyone else in the class feels
toward the three focal persons: a well-liked, a
neutral, and a disliked other. Within each sex
and grade level approximately one half of the 
Ss performed under Mode 1 and one half under 
Mode 2. 

Each S's self-esteem was assessed both by a
self-descriptive instrument and by teacher's ratings
of certain behaviors.

Subjects. A total of 415 Ss (197 boys and 218
girls) from 14 classrooms was drawn from Grades
five, six, and seven. The classes were obtained
from five different public schools, representing
considerable variation in socioeconomic status.
Each classroom contributed approximately 30 Ss;
however, because the average enrollment in each
class was 34 (range 30-38) the average boy or
girl rated approximately 16 classmates (the average
number of persons of the same sex in the
group, 17, minus 1, the child himself).

Variables and Instruments. Four separate sets
of responses were taken in Parts A, B, C, and D
of a questionnaire. Part A, designed to measure
each S's evaluation of all other classmates (pLo),
confronted each child with a list of the names of
all classmates of the same sex and a 7-point rating
scale with two extreme end points defined as “like
very much” and “dislike very much.” The midpoint
of the scale was defined as “no feeling” and
“neither like nor dislike.” The other 4 points on
the scale were also verbally defined.

Part B, a measure of each S's perception of the
degree to which he is liked by each relevant
classmate—oLs—consisted of the same list of 
names and a similar rating scale phrased to evoke 
the appropriate judgments. 

Part C,
Coopersmith, 1959) provided one measure of pLs,
the degree of liking for the self. A second index
of self-esteem was a 10-item Behavior


“defensively” high scorers from those with genuinely
high self-esteem. The former, according to
Coopersmith (1959, p. 87), should describe themselves
in quite favorable terms while their observable
behavior would be indicative of low
self-esteem.

Part D, a measure of oLq, the degree of perceived
attraction among various other classmates,
was composed of the same list of classmates and
a rating scale phrased to evoke the rater's judgment
of the liking relation among various other
persons, o's and q's. To restrict the number of
judgments required of S, three focal persons for
each S were selected by the experimenter (E)
to represent widely disparate values on pLo.
Which three classmates were selected as focal
persons for a given S depended only on that S's
pattern of ratings; thus, the focal persons were
not the same individuals for all Ss. (This procedure
resulted in the average child's having to make
3 × 15, or 45, judgments rather than the complete
set of all possible judgments, 16 × 15, or 240.)
An attempt was made to select as one focal
person someone whom the child had rated +3
(like very much), someone to whom he had assigned
a zero (neutral), and someone whom he had
rated -3 (dislike very much). Because not all
children used the entire range of the rating scale
in Part A, E selected one classmate S had rated
highest (H), one he had rated lowest (L), and
one representing an intermediate (M) value in
S's distribution of evaluative ratings.°
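
The selection of the three focal persons can be sketched as follows. The text does not say exactly how the intermediate classmate was chosen, so this sketch assumes the classmate at the middle rank of S's ordered ratings; that choice, the function name `select_focal`, and the example names and ratings are all hypothetical.

```python
def select_focal(ratings):
    """Pick the highest-, intermediate-, and lowest-rated classmates.

    ratings maps a classmate's name to S's Part A rating (-3 to +3).
    Returns (H, M, L). Assumption: M is the classmate at the middle rank.
    """
    ordered = sorted(ratings, key=lambda name: ratings[name])
    low = ordered[0]
    high = ordered[-1]
    middle = ordered[len(ordered) // 2]
    return high, middle, low


# Invented ratings for one subject.
example = {"Ann": 3, "Bea": 1, "Cal": 0, "Dee": -2, "Eve": 2}
focal = select_focal(example)
print(focal)
```

A fuller version would first look for classmates rated exactly +3, 0, and -3, as the text describes, and fall back to this rank-based choice only when S did not use the whole scale.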

The two modes of making oLq judgments in 
Part D depended upon whether the three focal 
persons were treated as o's (the perceived source
of affect—from whom various degrees of liking 
are perceived to radiate) or as q’s (the perceived 
object of liking—toward whom the affective rat- 
ings of others are perceived to converge). All 
judgments were made about how o feels toward 
q, and not vice versa. Thus, Mode 1 required 
each child, p, to judge how he thought each of the 
three selected focal persons, $o_L$, $o_M$, and $o_H$, felt
toward the remaining class members, the q's. 
Mode 2 reversed the task for p; he was required 
to judge how he thought the remaining class 
members, o's, felt toward each selected focal person,
$q_L$, $q_M$, and $q_H$.

Derived Measures and Analytic Procedures. 
Because the predictions obtained from the tetra- 
hedron model refer to the structure of cognitions 
within an individual, four intraindividual correla- 
tion coefficients (Pearson r) were calculated by 
electronic computer for each S (three coefficients 
for Study I and one for Study II). Each coefficient,
calculated to three digits, was transformed
to Fisher's $z_r$ (Edwards, 1950) for the
statistical analysis.
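
For readers unfamiliar with the transformation, a minimal sketch follows. It uses the standard definition of Fisher's z, which is the inverse hyperbolic tangent of r, and averages coefficients in the z metric before converting back to r. The function names and the sample coefficients are hypothetical and are not taken from the study.

```python
import math


def fisher_z(r):
    """Fisher's r-to-z transformation: z = 0.5 * ln((1 + r) / (1 - r)).

    math.atanh raises ValueError for r = +1 or r = -1.
    """
    return math.atanh(r)


def mean_r(rs):
    """Average correlations in the z metric, then transform back to r."""
    mean_z = sum(fisher_z(r) for r in rs) / len(rs)
    return math.tanh(mean_z)


# Invented intraindividual coefficients for three subjects.
coefficients = [0.10, 0.45, 0.62]
print(mean_r(coefficients))
```

Averaging in the z metric rather than averaging the raw coefficients is the usual reason for applying the transformation before an analysis of variance.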

The correlation coefficients obtained for Study
I are as follows. For each S, the degree of correlation
was computed between the ratings he assigned
to classmates in Part A and his three sets of ratings
in Part D, those with respect to a person
Low, Medium, and High on pLo. The three resulting
coefficients are designated $r_{AL}$, $r_{AM}$, and $r_{AH}$,
respectively.

°The three focal persons for each child are
labeled $o_H$, $o_M$, $o_L$ (or $q_H$, $q_M$, $q_L$), respectively,
for the classmate given a high, medium, and low
rating by S on the evaluative scale of Part A.

For Study II, only one correlation coefficient
was determined for each S. Ratings assigned in
Part A, indicating degree of liking for each other
classmate, were correlated with ratings in Part B
in which S judged the degree of liking of each
other person for him. The resulting coefficient,
$r_{AB}$, should be positively related to self-esteem
according to the quantitative balance model.


Results 


Study I. A simple Subjects × Conditions
analysis of variance (Edwards, 1950, pp.
204-302) was performed on the $z_r$-transformed
correlation coefficients, $r_{AL}$, $r_{AM}$,
and $r_{AH}$, for the entire sample of 415 Ss,
yielding an F ratio of 190 with 2/414 df,
p < .0001. Because the design involved
“repeated measurements,” the mean square
for Subjects × Conditions interaction was
the error term used to test the significance
of the main effect for Conditions, that is, levels
of pLo(q) (Edwards, 1950; McNemar, 1955;
Walker & Lev, 1953). The weak form of
the hypothesis of Study I, that the mean r
should be an increasing monotonic function
of the level of pLo, is strongly supported
by the overall analysis of variance as well
as by analyses performed separately for
each of the 28 groups. While the F ratios
of five groups fail to attain significance at
the .05 level, four of these have p values
between .05 and .10 and one has a p value
between .10 and .20. Of the remaining 23
groups showing a levels effect significant
at less than the .05 level, 9 are well beyond
the .001 level, 5 beyond the .005 level, 4
beyond the .01 level, and 4 beyond the .025
level. All but three groups show the predicted
trend: $\bar{r}_{AL} < \bar{r}_{AM} < \bar{r}_{AH}$. Each of
these involves a single reversal such that
$\bar{r}_{AL}$ is slightly larger (from .007 to .032) than
$\bar{r}_{AM}$.

A graphical representation of the highly
significant total levels effect for all Ss is
shown in Figure 4, giving evidence of a
nearly linear trend from $\bar{r}_{AL}$ to $\bar{r}_{AH}$. The
points are plotted to represent (along the
abscissa) the average S's pLo scale values
for the three selected focal persons. These
average values, varying only slightly across
groups and between sexes and modes, were
-.96, .93, and 2.92. In other words, the
average level of liking for the three focal
persons $o_L$, $o_M$, and $o_H$ approximated a
-1, +1, +3 pattern rather than the desired
-3, 0, +3 pattern, the result of a
tendency for most Ss to give a greater
number of positive than negative ratings.
For comparison, the broken line in Figure 4
shows the results for a subset of Ss (N =
56) whose ratings of the three focal others
on the pLo scale conformed to the “ideal”
pattern, -3, 0, and +3.

Very decisive support for the weak ver- 
sion of the hypothesis of Study I can be 
claimed. However, if the hypothesis is 
stated more stringently as a linear equation,
it is apparent that the elevation of the curve 
is greater and the slope of the curve is less 
than the balance model alone would pre- 
dict. These inferences concerning the fit of 
data to model are not strictly proper, how- 
ever, unless one makes rather strong assump- 
tions about the pLo scale—assumptions 
that are detailed in an ensuing section. 

Comparison of the results for all combinations
of sex and mode revealed a substantial
difference between Mode 1 and Mode 2 in
the magnitude of $\bar{r}$ (t = 4.65, df = 413,
p < .0001) but almost identical results for
the two sexes (t = .23, df = 413). Inspection
revealed no slope differences between the
sexes nor between the two modes. Figure 5
dramatizes the extent to which the curves
for the two modes have a similar slope but
a different elevation.

Fig. 4. Mean value of the dependent variable,
r (determined by calculating mean $z_r$, then transforming
to r) for all subjects (and for -3, 0, +3
subjects) at each of three levels of pLo(q).

Fig. 5. Mean value of the dependent variable,
r, for Mode 1 and Mode 2 at each of three levels
of pLo(q).

Study II. The hypothesis of Study II as
derived from the extended balance model
is that there is a positive correlation between
S's self-esteem and the measure of congruency,
$z_{r_{AB}}$. The latter expresses the
Fisher transformation of the correlation
coefficient between S's liking for a set of
others (Part A ratings) and the extent to
which he perceives his feelings to be reciprocated
by these other persons (Part B ratings).

The two indexes of self-esteem, Coopersmith's
(1959) SEI and BRF, are significantly
(p < .001) related to each other
(with some nonsignificant variation in this
relation across groups). Because the correlation
coefficient between the two scales is
only approximately .40 and because both
scales have considerably higher reliability
coefficients than .40 (cf. Coopersmith, 1959),
it appears that each scale measures a some- 
what different aspect of self-esteem. Such an 
inference reinforced the view that the two 
scales have potential value as joint predictors 
of congruency. However, before employing 
SEI and BRF as joint predictors of con- 
gruency, their separate correlations with the 
congruency index are examined. 

The degree of correlation between congru- 
ency and SEI was calculated for each of the 
28 groups. The variation in these coefficients 
over the 28 groups was not sufficient to re- 
ject the hypothesis that the coefficients are 
from a single population. The means of the
correlation coefficients calculated separately
within each group are extremely close to
those obtained by calculating one coefficient
for all boys, one for all girls, and finally one
for the entire sample as a single group. Therefore,
only the latter are reported in Table 1.
All correlations between SEI and congruency
are highly significant (p < .0005) for
boys, girls, and all Ss. Visual inspection of
each scatter plot revealed no evidence of a
curvilinear relation between SEI and congruency.
Although the correlation between
SEI and congruency is not of large magni- 
tude, it is in the direction predicted by the 
extension of Heider's theory and is highly
significant for the entire group ( = 44, df =
414, p < .0005).


TABLE 1

CORRELATION COEFFICIENTS BETWEEN CONGRUENCY(a) AND TWO MEASURES OF
SELF-ESTEEM TAKEN SINGLY AND IN COMBINATION

                                          Boys          Girls         All Ss
Index of self-esteem                    N(b)    r     N(b)    r     N(b)    r
SEI                                     196   .240    218   .222    414   .225
BRF                                     170   .245    186   .077    356   .131
SEI and BRF (Multiple R)                170   .292    186   .224    356   .284
SEI + BRF for |Z_SEI - Z_BRF| < 5 (c)    63   .244     67   .253    130   .226

(a) Congruency = Fisher transformation of Pearson correlation coefficient
between pLo (Part A ratings) and oLs (Part B ratings).
(b) Unequal Ns derive from lack of BRF scores in two school classes and
from nature of selection procedure for last index.
(c) Z = 50 + 10z = standard score.

Table 1 shows also that BRF is correlated
significantly with congruency (p = .0075),
but this relation is accounted for mostly by 
the boys. The sex difference in the correla- 
tion with BRF is highly significant with a 
two-tailed test (p < .001), the correlation 
between BRF and congruency for the boys 
being as high as that between SEI and con- 
gruency, whereas for the girls it is virtually 
zero. The three mean correlations obtained 
by combining (transformed) correlations for 
all boys’ groups, for all girls’ groups, and for 
all groups of both sexes, yielded estimates of 
r nearly identical with those given in Table 1. 
Combining the two measures, SEI and 
BRF, in a multiple-regression equation 
yielded multiple Rs for boys, girls, and the 
entire sample barely distinguishable from the 
corresponding correlations between con-
gruency and SEI alone. The relative weights
or partial regression coefficients expressed in 
standard score form (Walker & Lev, 1953, 
p. 319) of SEI and BRF in predicting con- 
gruency reflect the already noted sex differ- 
ence: That is, for boys the two coefficients 
are nearly equal (.167 and .180 for SEI and
BRF, respectively) while for girls, the two 
coefficients are quite different (.230 and 
—.033 for SEI and BRF, respectively). 
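
The relation between the pairwise correlations, the standardized partial regression weights, and the multiple R can be sketched for the two-predictor case. The formulas are the standard ones for two predictors; the inputs are the boys' correlations from Table 1 together with the approximate SEI-BRF correlation of .40 mentioned in the text. Because that value is only approximate and the Ns differ across rows of the table, the output should be expected to come close to, not to reproduce exactly, the reported weights (.167 and .180) and multiple R (.292). The function names are illustrative.

```python
import math


def standardized_weights(r_y1, r_y2, r_12):
    """Standardized partial regression weights for two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    if denom <= 0.0:
        raise ValueError("predictors must not be perfectly correlated")
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


def multiple_r(r_y1, r_y2, r_12):
    """Multiple R from R^2 = beta_1 * r_y1 + beta_2 * r_y2."""
    beta_1, beta_2 = standardized_weights(r_y1, r_y2, r_12)
    return math.sqrt(beta_1 * r_y1 + beta_2 * r_y2)


# Boys: r(congruency, SEI) = .240, r(congruency, BRF) = .245 (Table 1);
# r(SEI, BRF) = .40 is the approximate value given in the text (assumption).
beta_sei, beta_brf = standardized_weights(0.240, 0.245, 0.40)
big_r = multiple_r(0.240, 0.245, 0.40)
print(round(beta_sei, 3), round(beta_brf, 3), round(big_r, 3))
```

When the two predictors correlate about equally with the criterion, as they do for the boys, the weights come out nearly equal; when one predictor is almost uncorrelated with the criterion, as BRF is for the girls, its weight falls toward zero or slightly below.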
Another method of combining the two in- 
dexes to predict congruency involved sum- 
ming the two standard scores for those Ss 
whose SEI and BRF scores were not more 
than one-half standard deviation apart, a
method closely resembling one recommended
by Coopersmith (1959). The method
assumes that Ss whose SEI and BRF scores
are in agreement best represent different
positions on a dimension of genuine
self-esteem: That is, such a selection procedure
should exclude both the “defensively” high
scorers as well as those whose habitual verbal
self-descriptions reflect less favorable
conceptions of themselves than their observable
school behavior indicates.
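
The selection rule just described can be sketched as follows, assuming the standard-score convention of Table 1 (Z = 50 + 10z), under which one-half standard deviation corresponds to 5 Z units. The use of the population standard deviation, the strict inequality, and the sample scores are assumptions made for illustration only.

```python
import statistics


def standard_scores(raw_scores):
    """Convert raw scores to Z = 50 + 10z (population SD is an assumption)."""
    mean = statistics.fmean(raw_scores)
    sd = statistics.pstdev(raw_scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [50 + 10 * (score - mean) / sd for score in raw_scores]


def refined_index(sei_raw, brf_raw, max_gap=5.0):
    """Return (subject index, Z_SEI + Z_BRF) for Ss whose two standard
    scores differ by less than max_gap (5 Z units = one-half SD)."""
    z_sei = standard_scores(sei_raw)
    z_brf = standard_scores(brf_raw)
    return [
        (i, a + b)
        for i, (a, b) in enumerate(zip(z_sei, z_brf))
        if abs(a - b) < max_gap
    ]


# Hypothetical raw scores for five Ss.
sei = [10, 20, 30, 40, 50]
brf = [1, 2, 3, 5, 4]
selected = refined_index(sei, brf)
print(selected)
```

In this toy sample the last two Ss are dropped because their SEI and BRF standings disagree by more than half a standard deviation; only the Ss on whom the two measures agree contribute a summed score.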

For the 130 Ss whose self-descriptions
agreed with the teacher's ratings of them,
the correlation between congruency and
(Z_SEI + Z_BRF) is .226, a significant value
(p < .005) but one involving no improvement
in prediction of congruency over that
provided by SEI alone. The scatter plot was
inspected visually on the chance that there 
might be an unexpected curvilinear relation 
between congruency and this “refined” index 
of self-esteem. No such relation appeared. 


Discussion 


Study I 


For Study I the model’s prediction was 
stated both as an increasing monotonic 
function and as a linear equation, the latter 
statement requiring “stronger” assumptions 
about the properties of the measuring scales. 
These two ways of representing the model 
are designated the “weak” and the “strong” 
versions, respectively. 

The Balance Theory As a Monotonic 
Function. The results of Study I decisively 
support the weak version of the extension of 
Heider’s theory. The expected trend in the 
magnitude of correlation coefficients across 
different levels of pLo is an outstanding 
feature of the results for all groups com- 
bined (as well as being found rather con- 
sistently within the groups considered 
separately) and the statistical significance 
of the results leaves little doubt about the 
reliability of this finding. In concrete terms, 
the more a person, p, likes someone, o(q),
the more positive is the correlation between 
p's evaluation of a set of other persons and 
p's perception of o's feelings toward the 
same set of other persons (or p's perception 
of the feelings of this set of others toward 
q). In other words, a person's cognitions of 
the relations among others (when these 
relations are measured on at least an ordinal 
scale) tend toward the balanced or con- 
sistent organization implied by the quan- 
titative model. 

The Balance Theory As a Linear Equa-
tion. The extent to which the results accord
with the strong version of the model is 
more difficult to determine; it depends not 
only upon the obtained results but also 
upon whether one can assume that the pLo 
scale has at least the following properties: 
a zero representing a true affective
indifference point and polar extremes (+3 and
-3) representing maximum degrees of liking
and disliking, respectively. On the face
of it, the pLo scale is compatible with these 
requirements. That is, every position on 
the 7-point scale is given a straightforward 
and explicit verbal definition in terms as 
unambiguous as possible. 

The fact that the curve in Figure 4 for
all Ss appears to be linear and exhibits a
slope significantly greater than zero can be
interpreted as confirmation of the model's
predictions. But if one applies a more
rigorous test, the fact that the slope is not
so great as it is predicted to be can be taken
as evidence against the model. That is, if
the quantitative balance model is assumed
to predict that all cognitive structures are
balanced,* then these data might be taken
to provide another "exception to the rule"
that Zajonc (1960) suggests may become
increasingly characteristic of research
findings on consistency theory.


* The reader is reminded that in the context of
the present three-dimensional quantitative model,
any cognitive structure lying within the tetra-
hedron of Figure 2 is defined as “balanced.” This
definition, of course, calls many more structures
balanced than would a strict adherence to Heider's
rule as it is interpreted in Figure 2. To the extent
that balance, in this sense, characterizes S's entire
set of measured triadic structures, the data would
fit the equations given earlier. Any deviation from
the predicted slope of the line (cf. equations on
p. 4) could indicate that at least some cognitive
structures are imbalanced. Specification of which
particular triadic structures are imbalanced and
to what degree would require separate examination
of each structure. If one wishes to treat the
entire set of cognitions and/or feelings expressed
by S in a given content area as a single cognitive
structure, then the slope of the regression line
(the value of w in equation 3, p. 4) or the
coefficient of correlation between pLo, oLq, and pLq
can be interpreted as an index of the degree of
balance of S's cognitions in that domain.

From a somewhat different perspective, one
might also regard this correlation (balance)
coefficient as an index of the reliability (internal
consistency and precision) of a speaker's verbal
behavior. Such a balance coefficient might be
useful to the psycholinguist in describing the course
of a child's acquisition of language or in
describing differences among various language groups.
Presumably, only a trained logician, taking pains
to edit his verbal behavior in accordance with
the rules of formal logic, could obtain a balance
coefficient of +1.00. A child learning to speak
might display a balance coefficient of only slightly
greater than zero, while a schizophrenic exhibiting
“[...] speech” might have a negative coefficient.


In his comparison of attitudinal consistency
theories (“human nature avoids inconsistency,
imbalance, or dissonance”) with the earlier
concept of vacuum in physical science
(“nature abhors a vacuum”), Zajonc (1960)
notes that in both cases the principle
systematically accounts for many phenomena
but that there are too many exceptions to 
consider it a theoretically useful generaliza- 
tion. In the present instance, however, it is 
not completely clear that the discrepancy 
between predicted and obtained slope of the 
curve in Figure 4 should be interpreted as 
an “exception” to the prediction of balance 
theory. It is quite possible, for example, 
that the results obtained are precisely those 
to be predicted by balance theory if in- 
formation were available about some other 
variable in the situation. Such a possibility 
does not imply that the quantitative 
balance theory is so flexible that it can 
apparently “account for” any data, no 
matter what their pattern. Instead, the 
theory suggests the kinds of variables that 
should be examined for their possible rele- 
vance in improving the fit of any given 
set of data to the model. For example, it 
can easily be shown that the correlational 
data presented in Figure 4 would more 
closely accord with the model's predictions 
if it were known that the data had been ob- 
tained under a special condition char- 
acterized by a less than maximal value on 
some fourth relevant variable. More spe- 
cifically, it is possible that among the young 
Ss studied (ages 11-15), peers of the same 
sex are simply not very important or
salient as objects of expressed and perceived 
affect. Indeed, if one adds to the three 
variables in the structure, a fourth,
perceived importance of interpersonal
relations, it can be shown that the predictions
of the tetrahedron model are only special 
cases that assume maximally high values 
on a scale of perceived importance (as well
as on all other relevant variables). If
perceived importance is minimal (zero),
the predicted slope of the curve would be
zero rather than +1. In general, the higher
the perceived importance, the closer would
one expect the data to approximate the
predictions of the simple three-dimensional
tetrahedron model.

There is probably no reason why the 
quantitative balance model cannot be ex- 
tended to accommodate any finite number 
of variables. Perceived importance is offered
here as only one of several possible
explanatory variables; self-esteem is another
variable that might act as a moderator
variable in the same way. Whether
incorporating these variables or any others into
the model actually improves the corre- 
spondence between data and theory must, 
of course, be determined empirically. The 
question would best be answered by direct 
experimental manipulation of those vari- 
ables assumed to be relevant. 

The possibility that measurement error 
is one principal source of the less-than-
predicted slope of the curves in Figure 4
should also not be overlooked. In other 
words, the evaluative ratings may be in- 
herently unreliable or may be unstable over 
time; the possibility that the depressed cor- 
relations resulted from reliable fluctuations 
over time in degree of liking is especially 
noteworthy since up to 7 days elapsed be- 
tween the ratings of Parts A, B, and C and 
those of Part D. To what extent the lack of 
perfect cognitive consistency implied by 
the obtained slope of the curves should be 
interpreted as the result of some irreducible 
measurement error or as the result of failure 
to consider other identifiable relevant
variables in the structure remains an important
question for future research.

A quite different way to account for the 
smaller-than-expected slope of the obtained 
curve invokes possible weaknesses (other 
than unreliability) of the pLo scale itself.
For example, it may be that the scale does 
not discriminate finely enough nor extend 
far enough on its positive end—a criticism 
less likely relevant for the negative end of 
the scale because few extremely negative 
ratings occurred. That most Ss manifested
a favorability response set in rating their 
classmates would tend to reduce the scale’s 
effectiveness in discriminating extremely 
well-liked from moderately well-liked class- 
mates. Those Ss who did distribute their 
ratings over the entire scale apparently
reserved the highly positive end of the scale
for those they truly like very much. Such an
interpretation is supported by the fact that
r is considerably higher for the -3, 0, +3
Ss than for the total sample at the “same”
highly positive value of pLo(q) (cf. Figure 4).

A second feature of the data of Study I is
the greater elevation (larger value of k) 
of the curve than the model predicts. The 
upward elevation of the curve means that, 
given the slope of the line, the correlations 
are more positive than expected. 

The fact that r is near zero rather than
highly negative indicates that persons at-
tribute merely different rather than opposite
evaluations to class members they dislike
least). Only very slightly
opposing (negatively correlated) evaluations
to classmates disliked
very much (i.e., assigned a value of -3
whereas the model 
“indifferent” 
person toward others (or toward an “in- 
different” person by others) are perceived 
to be unrelated to a person’s own feelings 
toward these others, they in fact are per- 
ceived to be positively related to his own 
of others. 


assigned a zero (indifference) rating on the 

Harary (1959) has described the above-
noted phenomenon as a “tendency toward
positivity." Rosenberg and Abelson's (1960) 
concept of a "force to maximize potential 
gain and minimize potential loss" as well 
as what McGuire (1960) has labeled “wish- 
ful thinking” are also analogous tendencies 
that sometimes are opposed to balance and 
logical thinking, respectively. 

Perhaps the “positivity effect,” as
reflected in the correlations, is a special case
of the general tendency to assume
similarity with others (Cronbach, 1955, 1958;
Fiedler, 1958; Fiedler, Warrington, &
Blaisdell, 1952). According to this view, the
tendency to assume that others are similar
to oneself carries over to situations in
which one is trying to guess how a disliked
person (or someone neither liked nor
disliked) evaluates a set of others. This
tendency makes it difficult consistently to take
the role of a disliked (and presumably dis- 
similar) other person. 

An explanation of the heightened eleva- 
tion of the curve is possible also in terms 
of what DeSoto (1961) has called a “pre- 
dilection for single orderings.” This account 
stems from the common sense idea that it is 
simpler to order objects on only one dimen- 
sion than to order them in many ways. A 
greater amount of effort presumably is re- 
quired to construct, to learn, or to remember 
a set of multiple orderings than a single 
ordering of a set of objects. In the specific 
case at hand, while S is attempting to order
persons (actually, to rate them) as someone 
else would, his own ordering of the persons 
continually intrudes and presumably dis- 
torts the attempt to order from another per- 
son’s point of view. 

Another possible determinant of the 
greater-than-predicted elevation of the 
curve is that the zero point on the pLo scale 
may not truly represent psychological in- 
difference. It has already been noted that 
Ss do not tend to use the negative end of 
the scale and that the mean rating is very 
close to +1; this may be interpreted as a 
“favorability response set.” Taken by itself
this interpretation suggests that the psy- 
chological indifference point probably falls 
near +1 and that —3 represents a rather 
extreme degree of disliking. On the other 
hand, it is possible that the mean rating of 
+1 is not the product of a mere response 
set (favorability), but results instead from 
Ss’ genuinely positive feelings toward most 
peers. Such attraction would be of even
greater magnitude than it appears to be if 
the evidence in Figure 4 is taken literally. 
In fact, if the truth of the balance theory is 
assumed, the evidence in Figure 4 indicates 
that the psychological indifference point 
falls close to —1, the pLo level at which 
r is zero, and that -3 represents only a
small degree of disliking. These inferences
can properly be checked only by further
research with more highly refined scales.
For example, a point of affective neutrality 
could be established on an improved pLo 
scale by using other information known to 
be associated with indifference; informa- 
tion on intensity, certainty, or confidence 


(Cantril, 1946; Katz, 1944; Stouffer, 1950; 
Suchman, 1950) or on response latency
(Osgood et al., 1957) could be used to
establish empirically a point or region of 
psychological indifference. 

Interaction between Observations and 
Theory. In the two instances just dis- 
cussed—evaluating the results with respect 
to both the slope and the elevation of the 
curve—two main alternative explanations 
are available. The explanatory dilemma is 
such that either one must assume the truth 
of balance theory as one way to establish 
certain properties of the measurement 
scales, or one must make certain assump-
tions about the measurement scales to
determine whether and in what respects 
balance theory has been confirmed. This 
explanatory dilemma is, in principle, com- 
mon to all scientific endeavor, and it is re- 
solvable only by what some have called cir- 
cular argument; for the careful investigator 
the result is involvement, not in a “vicious 
circle,” but in a “spiral” of increasing con-
fidence in both theory and measuring scales
as each is successively refined in research. 
In the present instance, the strong version 
of the model should be checked more closely 
with measuring scales whose interval size 
and point of origin are more fully estab- 
lished. 

The above dilemma does not arise when 
the weak version of the quantitative balance 
model is evaluated, because the required 
assumption that the pLo scale has at least 
the properties of an ordinal scale (Stevens, 
1950) seems beyond controversy. 

Mode Differences. Another unexpected 
but highly reliable finding of Study I is 
that the judgments obtained under Mode 1 
are more highly correlated than are the 
“same” judgments under Mode 2. Detailed 
examination of the differences in the tasks 
required by the two modes yields at least 
two plausible ways to account for this 
fact. 

If the judgmental task of Part D is 
conceptualized as a two-stage process that 
requires (a) adopting a set (taking the role
of another person) and (b) making judg- 
ments within the context of the set, and if 
it is assumed that the first process requires 
more information (or energy), because adopting a
set implies preparing to make a variety of
individual judgments, then it can be shown
that Mode 2 requires not only a greater 
number of acts (changes in psychological 
state) but also a greater number of the dif- 
ficult ones than does Mode 1. Under Mode 
1, S was involved in 3N + 3 acts (3 changes
in set and from within each, N judgments);
each change in set is an act and there are,
in addition, N acts in each of 3 sets. If the
average number of others for whom Part D
judgments were made is 15 (N = 15) then
Mode 1 required 48 acts. Under Mode 2, S
performed 2 × 3 × N = 6N acts (3N set
adoptions and from within each set, a single
judgment). The specific task under Mode 2
was to adopt a set, make a judgment within
that set, adopt a second set, make another
judgment, etc., until 3N sets had been
adopted and a judgment made within the
context of each. Thus if N = 15, Mode 2
required 90 separate acts, nearly twice the
number required in Mode 1. Mode 2 also
required 15 times as many of the presumably
more demanding acts (adopting a set) than
did Mode 1. If the more difficult task of 
Mode 2 results in less reliable judgments, 
the correlations for Mode 2 should be lower 
(closer to zero) than the correlations for 
Mode 1 subjects. Were it not for an overall 
“positivity effect" noted earlier, the pre- 
sumed greater difficulty of the task under 
Mode 2 (and consequent lower reliability) 
also would result in a curve with lesser 
slope for Mode 2, since both negative and 
positive correlations would be closer to 
zero. Obtaining judgments under both Mode 
1 and Mode 2 in a design that includes
highly negative positions on the pLo scale
would provide a test of this hypothesis:
That is, the functions would cross if Mode 2 
judgments were less reliable than those of 
Mode 1. 
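
The act counts used in this argument can be checked with a short sketch. The function names are illustrative; the counting rule (each set adoption is one act, each judgment is one act, three sets of judgments in Part D) follows the description above.

```python
def mode1_acts(n_others, n_sets=3):
    """Mode 1: adopt a set once, then make N judgments within it.

    Returns (set adoptions, judgments)."""
    return n_sets, n_sets * n_others


def mode2_acts(n_others, n_sets=3):
    """Mode 2: adopt a set anew before every single judgment.

    Returns (set adoptions, judgments)."""
    return n_sets * n_others, n_sets * n_others


N = 15  # average number of others judged, as given in the text
m1_sets, m1_judgments = mode1_acts(N)
m2_sets, m2_judgments = mode2_acts(N)

print("Mode 1 total acts:", m1_sets + m1_judgments)   # 3N + 3
print("Mode 2 total acts:", m2_sets + m2_judgments)   # 6N
print("Ratio of set adoptions:", m2_sets / m1_sets)
```

The number of judgments is the same under both modes; the whole difference lies in the number of set adoptions, which is the act presumed to be the more demanding one.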

Another explanation of the Mode differ- 
ence derives from the difference between 


and seeing him as being liked (disliked). 
The task of Mode 1 makes it easy for S
to forget that he is to rate from another's
point of view since, once he has adopted the
set of another, he is not reminded
of this orientation while he judges
how that person feels toward each other 


person in the series. In other words, as the 
series of judgments are made under Mode 
1 the adopted orientation may become 
weaker (because it is not continually rein- 
stated) and its place gradually taken by S's 
own pervasive orientation. In contrast, only 
one judgment at a time is made from within
the context of a given set under Mode 2;
in this case S judges how much a given
person is liked by each of a set of others. Each
judgment S makes necessarily occurs
immediately after he has adopted the
orientation of another person; hence S's own
orientation is less likely to intrude as the primary
framework within which the ratings are 
made. In short, the conditions of Mode 1 
more easily allow S to lapse into assuming 
similarity (or into manifesting his own sin- 
gle ordering) than do the conditions of Mode 
2. Since increasing either of these disposi- 
tions (assuming similarity with others or 
manifesting one’s own ordering) would pro- 
duce more highly positive correlations 
among S's ratings, one would predict Mode 
1 correlations to be higher than those for 
Mode 2. 


Study II 

Only the weak version of the model was 
tested for Study II inasmuch as it seemed 
very unlikely that the pLs (self-esteem) 
scale possessed the required characteristics 
for a test of the strong version: That is, the 
results were evaluated only with respect to 
whether the obtained regression line has a
slope greater than zero. 

Although the relations among congruency 
and the measures of self-esteem are not 
strong enough to enable efficient prediction 
of individual differences, they do reach high 
levels of statistical significance. That the 
correlations are not higher can be attributed 
in part to relatively undeveloped and in- 
adequately formulated instruments for as- 
sessing self-esteem. 

Concretely, the findings of Study II mean
that the higher a person’s self-esteem, the
more positive is the correlation between his
feelings toward a set of others and his
perception of their feelings toward him. While
it is true that persons generally like those
who they think like them and dislike those
who they think dislike them, the main
import of Study II is that this generalization
is most strongly manifested in persons with 
high self-esteem. 

If balance theory is taken seriously, it 
is possible to suggest one reason for the 
relatively low correlations between self-
esteem and congruency. The fact that the
mean index of congruency is very high
(mean r = .74) can be interpreted as
evidence that, on the average, the sample Ss
hold quite positive conceptions of them- 
selves. The rather marked restriction on 
the range of self-esteem, compared to the 
range of feelings toward other objects or 
persons, could be an important determinant 
of the low degree of measured correlation 
between self-esteem and congruency. 

The relation between self-esteem and 
congruency was found for both measures of 
self-esteem—SEI and BRF. However, the 
correlation with the latter index (that based 
upon observer's ratings) varies with sex. 
Apparently the behavior of a fifth-, sixth-,
or seventh-grade boy, as it is observed by
a teacher, is somewhat related to his
feelings about himself and his perception of his
relations with peers, whereas this does not 
hold for the sample of girls. 

Several related and rather general
theoretical implications of the two studies
warrant further comment. The first is the
problem of defining the boundary
conditions within which cognitive balance can be
expected. A second problem concerns indi- 
vidual differences in what some authors call 
"tolerance for imbalance." A third involves 
the question of multiple routes to achieving 
balance (alternative methods of dissonance 
reduction) in the context of experimental 
studies of attitude change. 

The findings of both Study I and Study 
II clearly support one of the most important
theoretical implications of the quantitative
balance model: that the tendency toward
balance is strongest when all perceived and/
or felt relations are maximally intense. For
example, the correlation between pLo and 
oLq is maximally high only when p feels 
relatively strongly (intensely) about q. 
This is shown directly for the case of intense 
positive feelings by the data of Figure 4. 
Such a conclusion is also supported by a 
further study by the author (in prepara- 


tion) as well as one by Kanouse (1964). 
The latter studies show that very high levels 
of attitudinal consistency are exhibited in
attitudes toward controversial social issues 
and public personalities evoking intense 
feelings. For example, those extremely 
favorable toward Albert Schweitzer judge 
Schweitzer’s position as nearly identical to 
theirs on a wide variety of issues including 
abstract art, capital punishment, comic 
books, legalized gambling, movie censorship, 
racial segregation, and the United Nations. 
Conversely, persons extremely unfavorable 
toward Fidel Castro, for example, tend to 
judge his stands as nearly opposite theirs on 
the same set of issues. Positive correlations 
between S's own views and the assumed 
views of an admired person and negative 
correlations with the assumed views of the 
villain reach average values of .80 and —.30, 
respectively. Perhaps these correlations 
could be made even higher with an effective 
experimental manipulation of the intensity 
of S's attitude toward the persons or issues 
in question. 

The fact that the correlations are not 
typically +1.0 or —1.0 suggests that com- 
plete balance (i.e., balance among all ele- 
ments of an attitude cluster) is a very 
unusual state requiring special conditions.
Exactly what these special conditions are 
is not yet clear, but the present formula- 
tion suggests that maximally polarized 
values (maximal distance from a zero or 
indifferent point) on all variables in the 
structure are required. Because these rather 
special conditions are rare in naturally oc- 
curring attitude structures (and may be 
difficult to produce experimentally) balance 
theorists have been justifiably cautious in 
speaking of balance as a “tendency” rather 
than a state that is unequivocally char-
acteristic of an attitude cluster. The
strength of the quantitative balance model 
in the face of these considerations is that 
it defines just those (perhaps unrealizable) 
conditions under which complete balance 
would be found and, more important
empirically, it predicts the degree of balance
that will occur under any values of the
relevant constituent variables (relations).
Note that under some circumstances the
prediction made by this model may be
simply, “there will be no correlation among
the ratings.” While predicting the null 
hypothesis is generally uninteresting (and 
even undesirable), it becomes of consider- 
ably greater interest when it is placed in 
the context of a more general set of
predictions (Suppes, 1964). In Study I, for
example, such a prediction is simply a special
case of the general linear equation; the
special case can be stated, “if pLo is zero
then there will be no correlation between
pLq and oLq.” Another example in which
predicting the null hypothesis is a special
case of a more general set of predictions
might be: "if S's self-esteem is extremely
low, there will be no relation between
r(pLq, oLq) and pLo" (cf. Figure 4). That is,
the line of regression will not have a slope
greater than zero.

The findings of both Study I and Study 
II also may have implications for the prob- 
lem of specifying the basis of individual 
differences in what has been called “toler- 
“tolerance for 
Feather, 1964; 


that any vari-
able reliably correlated with individual dif-
ferences in “tolerance for
inconsistency” should be conceptualized as
simply another relevant variable in the
structure moderating the relationships
among other response variables. This treat-
ment places such additional variables sys-
tematically within the framework of bal-
ance theory. Study II
strongly supports 
gion that the usual 
in the p-o-x triad require the assumption 
that p likes himself. Applying an analogous 
conclusion to the more complex structures 
of Study I suggests that Ss with a high
degree of “tolerance for inconsistency” (Ss,
that is, whose attitude structures are 
poorly predicted by a balance theory that 
ignores individual differences) may have 
generally low levels of self-esteem. It is 
likely, also, that such Ss have 
reinforcement histories with respect to the 
verbal behavior in question. 
If these possibilities were true, then treating
individual differences in self-esteem (or
degree of socialization) as a fourth variable
in the structure would significantly increase
the proportion of variance accounted for
within the framework of balance theory.

* Unpublished manuscript, 1961, titled “Is there
a tolerance for dissonance?”

Whatever their source, observed indi- 
vidual differences in tolerance for incon- 
sistency constitute а problem warranting 
further study. Kanouse's (1964) attempt to 
relate individual differences in achieved 
balance to several personality measures, as
well as some of the data on self-esteem in
Study II, constitute useful beginning
ventures. Future research on this question
should probably consider the trait dimen- 
sion Newcomb (1963, p. 385) has discussed 
as “autism-realism.” This trait may be rele-
vant to individual differences in tolerance
for imbalance in view of the likelihood that 
the perception of interpersonal relations (or 
the perception of others’ attitudes in
general) may reflect a compromise between
complete cognitive consistency and the
constraints of [...]. According to this view,
one with less than perfect balance is likely
to be more [...] exhibits completely balanced
cognitive structures.

Finally, the proposed quantitative model
has implications for the well-known
observation (Heider, 1958, pp. 207-209) that
there are several ways to eliminate
imbalance or inconsistency, once it has been
produced. This issue is of greatest interest
in the context of studies [...] produce
attitude change by direct experimental
manipulation of variables. For example,
Rosenberg and Abelson (1960, p. 121),
following in part an earlier paper of [...]
(1959), discuss essentially three ways of
redressing imbalance: (a) changing [...],
(b) redefining or differentiating concepts,
and (c) ceasing to think about the matter.
Festinger’s (1957, p. 264) somewhat
analogous set of three methods of reducing
dissonance are: (a) changing the evaluation
of one or more of the elements involved in
dissonant relations, (b) adding new
consonant cognitive elements, and (c)
decreasing the importance of the elements
involved in dissonant relations. The work of
other investigators points to other routes
including, for example, changing degree of
perceived choice (Cohen, 1960), “projecting”
undesirable traits onto a person similar
to oneself (Bramel, 1962), and being incredulous
(Osgood & Tannenbaum, 1955)
or not trusting the communicator (Hov- 
land & Weiss, 1951). These are merely a 
few of the numerous possibilities; in fact, 
there should exist at least as many methods 
of reducing imbalance as there are relevant 
variables in the cognitive structure. In other 
words, each dimension in the quantitative 
balance model defines one variable that
is a potential candidate for change when
imbalance is introduced into the system.
Which variable will actually change or 
whether all variables will change slightly 
probably depends upon a number of factors, 
including the amount of effort required to 
effect change on a given variable (Rosen- 
berg & Abelson, 1960, p. 133). In practice, 
investigators have usually tried to study 
change in one variable at a time, and they 
do so by attempting to block alternative 
routes to reducing imbalance. For example, 
to produce attitude change as a function of
a persuasive communication, it is necessary
to ensure that S perceives the communicator
as trustworthy (Hovland &
Weiss, 1951) and the issue as important
(McGuire, 1960a, 1960b, 1960c), etc.
Similarly, in a choice experiment, to produce
increased attractiveness of a chosen alter- 
native and decreased attractiveness of un- 
chosen alternatives, it is necessary that S 
feel he had a “free choice” (Cohen, 1960) 
and that S not feel the whole matter is 
really unimportant (Festinger, 1957), etc. 
At this juncture the quantitative balance 
model is useful in providing a rationale 
for the systematic interpretation of the 
variety of possible methods to reduce dis- 
sonance or eliminate imbalance. When more 
sophisticated measurement of these dif- 
ferent variables and their resistance to 
change becomes available it should be pos- 
sible to test more precise predictions. For 
example, given a pattern of experimenter- 
produced changes in one or more inde- 
pendent variables, the model should predict 
which variables are likely to change and 
how much change is expected on each. 


Summary 


The investigation addressed the problem 
of measuring the degree of consistency or 
balance of cognitive structures for the 
general case in which the relations among
elements in the structures are treated as 
continua rather than simple dichotomies. 
The quantitative extension of Heider’s 
theory of cognitive balance offered here 
enables systematic treatment of cognitive 
structures whose elements have any degree 
of similarity or attraction to one another. 

The extended balance model was tested 
in two behavior domains: perceived inter- 
personal attraction among variously evalu- 
ated peers (Study I) and perceived recipro- 
cation of liking by others toward the self 
under various levels of self-esteem (Study 
II). Both studies were conducted with 14 
groups of boys and 14 groups of girls in 
School Grades five, six, and seven. Each of 
the 415 children in the sample provided the 
following data: self-ratings on a self-esteem 
inventory, a series of sociometric-like judg- 
ments indicating degree of liking for all 
classmates, perception of the extent to which 
he is liked by each peer, and perception of 
the liking relations among selected class- 
mates. The last named judgments were 
obtained via two modes with a given S
performing under only one mode. Mode 1 
required judgments about how three focal 
persons (a well-liked, a neutral, and a dis- 
liked classmate) felt toward the remaining 
classmates, and Mode 2 required judgments 
about how each other person in the class 
felt toward the three focal persons. 
Teacher's ratings of pupil behavior related 
to self-esteem were also obtained. 

Implications regarding intercorrelations 
among particular sets of S's judgments were 
deduced from the quantified version of 
Heider’s theory. In Study I the theory 
predicted a gradient in the degree of correlation
between S's own evaluative ratings
of a set of others and the ratings he
perceived radiating from or converging upon 
the three focal persons, the gradient being a 
function of S's attraction to each focal per- 
son. For Study II the theory predicted that 
the degree of correspondence between S's
... with S's level of self-esteem.
... predictions were confirmed at very
high levels of statistical significance for
both studies.

Study I revealed that the perceived
preference relations among others are characterized
by a high degree of balance as ...
by the quantitative version of
Heider's theory but that the degree of balance
is less than perfect; this result is discussed
as possibly reflecting the moderating
effects of other relevant variables. Correlations
among sets of judgments were also
generally more positive than predicted by
the theory, a result consistent with the
presumed operation of a “positivity tendency.”
No difference between the sexes appeared,
but a striking difference was found
between the judgmental modes; the intra- ...

... toward others are reciprocated varies positively
with his level of self-esteem. This
relation is the same for both sexes when a
self-rating device is used to ... self-esteem.

... ings for several kinds of ...
are discussed. The model ...
more highly differentiated ...
it contributes to the possibility ...
ing better measuring scales. It ...
that the quantitative ...
theory and the method ...
the theory are ...
study properties of ...


EFFECT OF ELECTROCUTANEOUS DIGITAL STIMULATION
ON THE DETECTION OF SINGLE AND DOUBLE
FLASHES OF LIGHT¹


STANLEY NOVAK²
Columbia University


2 brief flashes separated by a dark interval were presented successively to
the same foveal locus of O's dark-adapted right eye. Stimulus values were
chosen to obtain a report of 2 events 80% of the time. A marked decrease
in temporal resolution occurred when a brief shock was delivered to O's
ipsilateral hand 25 msec. before either of the flashes. In addition, a similarly
presented brief shock was found to lower the luminance required for absolute
threshold of a single flash. Based on this finding, a brightness enhancement
“masking” hypothesis was advanced to explain the effect of the shock
on the temporal resolution of the flashes. Data from subsequent experiments
were not consistent with this hypothesis. Additional experiments demonstrated
that the amount of reduction in the temporal resolution of the
flashes was not a function of shock intensity. Other directions of explanation
based on signal detection theory and neurophysiological “alerting”
data are examined and are also found inadequate to encompass the present
data.

Fig. 1. Schematic view of Optical System A and associated circuitry. A, eye; B, viewing lens; C,
Pellicle beamsplitter; D, No. 26 Wratten filter; E, neutral density filter; F, fixation-point field stop; G,
diffusion disc; H, fixation-point light source; I, field stop; J, variable density polaroid; K, fixed neutral
density filters; L, collimating lens; M, ultraviolet source; N, ultraviolet filter; O, glow modulator tube;
P, photomultiplier tube; Q, ammeter; R, glow modulator gate; S, timing circuitry; T, constant current
DC stimulator; U, EEG disc electrodes; V, oscilloscope.


tion of the light flashes, (b) a constant current
source for electrocutaneous stimulation,
(c) the timing circuitry associated
with the above units, and (d) the devices 
which allowed monitoring of the temporal 
relationships and waveforms of the light 
flashes and electrocutaneous stimulus. 

Optical System. Optical System A was a 
conventional monocular Maxwellian view 
system used to present successive flashes of 
equal luminance (Figure 1). 

A glow modulator (Sylvania R/1131C)
operated at a constant current of 25 milliamperes
was used as a light source. The source
was activated by a 350-volt DC pulse
generated by associated timing and gating
circuitry. As shown in Figure 1, the light
output of the glow modulator (O) initially
passed through a collimating lens (L) having
a focal length of 76.2 millimeters. The intensity
of the collimated beam was then
discretely and continuously varied by means
of fixed neutral density filters (K) and by a
variable density polaroid (J). The collimated
beam then passed through a circular
aperture having a diameter of 2.54 millimeters
drilled in a field stop plate (I). The
field stop was placed at the focal length
(184.1 millimeters) of Viewing Lens B and
provided the stimulus field. The collimated
beam, passing through the circular aperture,
was then focused on the observer’s (O's) eye
by means of Lens B.

A Pellicle beamsplitter (National Photocolor
Corporation), positioned at a 45° angle
to the path of the collimated beam, was
placed between the field stop and Lens B
(C). Adjacent to the beamsplitter, and at a
90° angle to the path of the collimated
beam, was a second light system which
provided illumination for the fixation point.
The light source (H) used was a General
Electric tungsten No. 47 pilot lamp powered
by a 6.3-volt regulated DC supply. The
output of this source first passed through a
white plastic diffusing material (G) and then
through a circular aperture 1.09 millimeters
in diameter drilled in a field stop plate (F)
placed at the focal length of Lens B.
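
As a rough check on the geometry described above, the angular size of each field stop can be estimated from its aperture diameter and the focal length of Viewing Lens B. The sketch below assumes simple small-object geometry (an aperture at the focal plane of the lens subtends an angle of 2·atan(d / 2f)); the function and constant names are illustrative and are not taken from the original report.

```python
import math

def visual_angle_deg(aperture_mm: float, focal_length_mm: float) -> float:
    """Full angle (degrees) subtended by a circular aperture located
    at the focal plane of the viewing lens."""
    return math.degrees(2.0 * math.atan(aperture_mm / (2.0 * focal_length_mm)))

# Values quoted in the apparatus description.
FOCAL_LENGTH_B_MM = 184.1      # focal length of Viewing Lens B
STIMULUS_APERTURE_MM = 2.54    # stimulus field stop (I)
FIXATION_APERTURE_MM = 1.09    # fixation-point field stop (F)

stimulus_angle = visual_angle_deg(STIMULUS_APERTURE_MM, FOCAL_LENGTH_B_MM)
fixation_angle = visual_angle_deg(FIXATION_APERTURE_MM, FOCAL_LENGTH_B_MM)

print(f"stimulus field: {stimulus_angle:.2f} deg")
print(f"fixation point: {fixation_angle:.2f} deg")
```

Under these assumptions the two angles come out near 0.8 degree and 0.3 degree, which is in line with the visual angles the report gives for the stimulus field and the fixation point.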

The light emerging from the fixation 
point field stop was then reduced in in- 
tensity by a Kodak Neutral Density 3.00 
filter (E). A Kodak No. 26 filter (D) was 
also interposed and limited the transmission 
to the red end of the spectrum above 
millimicrons. The dim red fixation beam was 
then reflected 90° by the Pellicle and super- 
imposed upon the glow modulator beam. 
Thus, with the above system, a just suprathreshold
red fixation point subtending a
visual angle of 0.33° was superimposed upon
the center of a circular stimulus field sub-


vated with striking pulses of brief duration,
unreliability of operation often occurs. A
technique similar to that originally developed
by Matin (1964) was used to increase
the ionization of the gases within
the tube and improve the reliability of
operation. In the present study, the optical
system contained a General Electric Ar-4
argon ultraviolet lamp (M) used to irradiate
the glow modulator (Figure 1).
The ultraviolet source was powered by a
135-volt DC regulated supply and drew 2
milliamperes of current. As shown in Figure
1, the output of this source was passed
through a Corning No. 9863 (color spec.
7-54) filter ... visible
radiation above 410 millimicrons from
reaching O's eye.

For certain experiments, Optical System A
was modified to allow presentation of two
successive flashes ... luminance,
and the new system was designated Optical
System B (Figure 2).

The modification involved the conversion
of the single-beam Maxwellian view system
to a dual-beam system. This was achieved
by mounting a duplicate light channel at
right angles to the existing collimated beam. 
This new light channel was mounted in
place of the previously described fixation
system, and a new type of fixation system
was used. In this position, the new collimated
beam was reflected 90° by the beamsplitter
(C) and superimposed upon the
existing collimated beam. The superimposed
beams were then focused on O's cornea by
passing through Lens B. A field stop aperture
(G2) identical in size with the one
previously used in Optical System A was
used in the additional light channel.

The modification of Optical System A 
made necessary the construction of a differ- 
ent fixation point system. A dim red fixation 
point, identical in size and appearance with 
that used in Optical System A, was super- 
imposed upon the center of the stimulus 
field. This was achieved by the use of an 
edge-lighted transparent lucite strip (F) 
placed in front of one of the field stops (G1).
When so positioned, one edge of the lucite 
strip protruded through the side of the 
housing of the optical system into a lamp- 
house containing the necessary illumination. 


filtering used in 
used to edge-light the lucite strip. 

Optical System A and its later modifica- 
tion were contained in a wooden housing
attached to the outside wall of a light-tight
booth. A 95 A-millimeter-diameter metal 
tube passing through this wall allowed O to 
view the stimuli from within the booth. A 
“bite-board” arrangement was used within 
the booth to assure constancy of head 


Electrocutaneous Stimulation. The electrocutaneous
stimulus was a 1-millisecond DC
pulse. The current for this
pulse was generated by a high-impedance-output
stimulator (T) which supplied a
constant current despite fluctuations in O's
skin resistance (Figure 1). The intensity
of electrocutaneous stimulation was discretely
variable in 10-microampere steps.
The duration of the electrocutaneous pulse
was determined by a timer (S) which electronically
gated the stimulator for the desired
duration. The electrodes used were
standard Medcraft 9-millimeter-diameter
EEG disc electrodes (U). Attachment of
the electrodes is described in the Method
section in Experiment A.

Fig. 2. Optical System B. A, eye; B, viewing lens; C, Pellicle beamsplitter; D, fixation-point light
source; E, No. 26 Wratten filter; F, lucite strip providing fixation point; G1 G2, field stops; H, variable
density polaroid; I1 I2, fixed neutral density filters; J1 J2, collimating lenses; K1 K2, ultraviolet sources;
L1 L2, ultraviolet filters; M1 M2, glow modulator tubes; N1 N2, photomultiplier tubes.

Timing and Gating Circuitry. Temporal 
control of the stimuli was achieved by the 
use of six timers (S) which could be cascaded 
to produce almost any desired sequence of 
events. Each timer could generate pulses 
variable in duration from 1 millisecond to 
1,200 seconds. As previously mentioned, 
one of the timers was used to directly gate 
the DC stimulator to produce an electro- 
cutaneous stimulus of desired duration. In 
the case of the glow modulator, the timer 
delivered a pulse of desired duration to 
another gating device (R). This intermediary
gating circuit in turn generated a pulse of
the same duration but having the voltage
requirements to operate the glow modulator.
The timing units were designed to operate 
within 2% accuracy. 

Monitoring System. The light output of 
each glow modulator was monitored by an 
RCA 931 A photomultiplier tube (P) ad- 
jacent to it. The photomultiplier output was 
displayed on the upper beam of a Tektronix 
532 oscilloscope (V). 

No attempt was made to directly monitor
the output of the constant-current stimulator
during the course of the experiment;
however, the timing pulse which gated the
stimulator was displayed on the lower beam
of the oscilloscope. With this instrumentation,
constant monitoring of the temporal
relationship of the visual and electrocu- 
taneous stimulation was achieved. 

Calibration. Spectral differences between
matching fields made calibration with a
Macbeth illuminometer difficult. It was
decided to substitute a matched tungsten
source for the glow modulator in order to
permit calibration with the illuminometer.
The glow modulator was operated in steady
state and its crater imaged on the surface
of a photovoltaic cell (General Electric PV-1)
placed at the focal length of Viewing Lens B.
The cell was corrected for the spectral
sensitivity of the eye and also contained a
shield over the photosensitive strip having
a circular aperture identical to the size of the
crater image. The output of the photocell
when thus illuminated was read from a
Leeds-Northrop galvanometer connected to
the cell. Using original components, the
main channel of the optical system was
duplicated on a bar photometer and a standard
tungsten lamp substituted for the glow
modulator. The standard lamp was operated
at a color temperature of 2,850°K. The light
was focused through a condensing lens onto
a frosted glass disc. The transmitted light
was then focused through the optical system
so that its image filled the aperture in the
photocell shield. The luminance was adjusted
to give the desired reading on the galvanometer,
and the entire photocell assembly was
then removed from the optical system and
replaced with a Macbeth illuminometer.
The luminance value of the glow modulator ...
be 10,260 millilamberts. ...
source as a reference,
the second glow modulator in Optical System B
was calibrated using the photocell
and galvanometer. The luminance value of
this source was 8j millilamberts. These
figures were then corrected to take into
account the decrease in light output occurring
when the glow ...
pulsed devices.


conditions and experimental design. The 
method and results of the experiments are 
discussed in detail below. 


3 Calibration using the bar photometer was
performed by Electrical Testing Laboratories,
Incorporated, 2 East End Avenue, New York
City.

Observers. Five Os were used over the
course of preliminary experimentation. Two
male Os, SN and RF, were used for the entire
series of final experiments. Both were 27 ...
and were graduate students in ...
psychology at Columbia University.
... vision, while
RF wore corrective lenses during the experimental
sessions. Extensive preliminary training
was given for each of the experimental
procedures described below.


EXPERIMENT A. EFFECT OF ELECTROCUTANEOUS
DIGITAL STIMULATION ON THE
PROBABILITY OF DISCRIMINATING
TWO SUCCESSIVE FLASHES
OF LIGHT


Method


Stimulus-Response Conditions. Two flashes (f1
and f2) separated ...
duration, −0.20 log millilambert in luminance,
and subtended 0.80° ...
flash sequence was termed the standard flash-pair.

In the majority of experiments in this investigation,
the probability of reporting two flashes
(P2) was the dependent variable. In this, and in
subsequent experiments using P2 as a measure,
O indicated by tapping whether “one” or “two”
flashes were seen. P2 was equal to the number of
times O reported seeing two flashes divided by the
total number of presentations under a given
experimental condition. P̄2 refers to the mean
probability determined from a number of experimental
sessions for a given set of experimental
conditions.
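
The definition of the dependent measure can be illustrated with a short sketch. The reports below are hypothetical and are not data from the study; the function name is illustrative, and the use of the sample standard deviation is an assumption, since the report does not say which estimator was used.

```python
from statistics import mean, stdev

def p2(reports):
    """Proportion of presentations on which O reported "two" flashes."""
    if not reports:
        raise ValueError("at least one presentation is required")
    return sum(1 for r in reports if r == "two") / len(reports)

# Hypothetical reports from three short sessions (not data from the study).
sessions = [
    ["two", "two", "one", "two", "two"],
    ["two", "one", "one", "two", "two"],
    ["two", "two", "two", "two", "one"],
]

per_session = [p2(s) for s in sessions]   # P2 for each session
mean_p2 = mean(per_session)               # mean probability over sessions
sd_p2 = stdev(per_session)                # sample SD (assumed estimator)

print(per_session, round(mean_p2, 3), round(sd_p2, 3))
```

Each session yields one proportion, and the mean probability for a condition is the average of those per-session proportions.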

During preliminary experimentation, the standard
flash-pair was found to yield an approximate
P2 of 0.800 for each of the Os. This was determined
by consecutive presentation of the standard flash-pair
every 15 to 20 seconds and after 5 minutes of
dark adaptation. Based on ...

... electrodes, ...
5 millimeters ...
skin crease made ...
and distal phalanges. The ...
to the digit with ...
technique, developed by Schmid (1961), results
in a minimum of discomfort due to the pulsation
of blood vessels, perspiration, etc. The polarity
of electrode placement was held constant, the
electrode nearest the fingertip always having a
positive polarity.

Absolute electrocutaneous thresholds were 
determined for each O using a modified method
of limits (ascending series only). Twice the value 
of O’s absolute electrocutaneous threshold in- 
tensity was used as the standard value of stimulus 
intensity over the course of the experiments. 
SN and RF received 1.70 milliamperes and 1.20 
milliamperes, respectively. It should be noted that
these values of stimulation were not reported as 
aversive, and no reflex twitches of the finger 
muscles were visually observed upon presentation 
of the pulse. 

In the present experiment, the time between
electrocutaneous and visual stimulation was the
independent variable. Simultaneity of onset between
the electrocutaneous stimulus and the first
flash (f1) in the standard flash-pair was designated
t0. The temporal difference (Δt) between the onset
of the electrocutaneous stimulus and the onset of
the first flash was expressed relative to t0. A
negative sign was used to indicate that the onset
of the electrocutaneous stimulus came before the
onset of the first flash. A positive sign indicated
that the onset of the electrocutaneous stimulus
came after the onset of the first flash. For convenience,
an electrocutaneous stimulus presented
at some negative value of Δt is said to have occurred
in “negative time.” Conversely, an electrocutaneous
stimulus presented at some positive
value of Δt is said to have occurred in “positive
time.”
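
The sign convention for the shock-to-flash interval can be stated compactly in code. This is a minimal sketch: the function names and the onset times in the usage lines are hypothetical, and only the convention itself (shock onset minus first-flash onset, in milliseconds) comes from the text.

```python
def delta_t_ms(shock_onset_ms: float, first_flash_onset_ms: float) -> float:
    """Shock onset relative to the onset of the first flash.
    Negative: shock precedes the flash. Positive: shock follows it."""
    return shock_onset_ms - first_flash_onset_ms

def describe(dt_ms: float) -> str:
    """Label an interval using the report's terminology."""
    if dt_ms < 0:
        return "negative time"
    if dt_ms > 0:
        return "positive time"
    return "t0"

# Hypothetical onsets on a common clock, in milliseconds.
before = delta_t_ms(shock_onset_ms=75.0, first_flash_onset_ms=100.0)
after = delta_t_ms(shock_onset_ms=160.0, first_flash_onset_ms=100.0)
print(before, describe(before))
print(after, describe(after))
```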

Experimental Design. Certain control considerations
affected the design of the experimental procedure.
During preliminary experimentation, two
types of presentations were used in a constant
stimulus design. One type was a so-called experimental
“shock” presentation in which the electrocutaneous
stimulus accompanied the standard
flash-pair at some value of Δt. The other type was
a control “nonshock” presentation in which the
standard flash-pair was presented alone. The experimental
session consisted of 30 nonshock presentations
randomized with 30 shock presentations.
The 30 nonshock presentations allowed monitoring
of O's P2 level during the session. The P2 value
computed from the 30 nonshock presentations
was termed the Intrasession control P2. However,
calculation of the Intrasession control P2 for the
experimental situation yielded a value of approximately
0.600 for each O. This value was substantially
lower than the P2 of 0.800 originally determined
by consecutive presentation of the standard
flash-pairs alone. The difference between the P2
level determined from the consecutive presentation
of the nonshock standard flash-pairs and the
P2 level determined from the standard flash-pairs
when randomized among shock presentations during
the session was termed the Intrasession shift.
After experimenting with a number of alternatives,
the Intrasession shift was finally eradicated
by the addition of 40 randomly presented nonshock
standard flash-pairs to the experimental session.


In light of this preliminary finding, a number of
control standard flash-pairs were presented consecutively
at the beginning of each experimental
session. The P2 determined from these presentations
was termed the Presession control P2.

O's electrocutaneous threshold was also determined
before, in the middle of, and after each
experiment by a modified method of limits. These
measures were termed the Presession, Midsession,
and Postsession electrocutaneous thresholds, respectively.

The final procedure may be summarized as
follows: Attachment of the electrodes followed by
5 minutes of dark adaptation; Determination
of the Presession control P2 by presentation of 30
nonshock standard flash-pairs; Determination of
the Presession electrocutaneous threshold; Presentation
of 100 control nonshock standard flash-pairs
randomized with 60 experimental-shock standard
flash-pairs; Final determination of Postsession
electrocutaneous threshold. A period of from 15
to 20 seconds was allowed between successive
presentations. A 3-minute rest period was given
during the middle of the session. During this time,
a Midsession electrocutaneous threshold determination
was made.

Five values of Δt (12 shock standard flash-pairs
per value) were investigated in any one session.
Data from a group of eight sessions using the same
Δt values but different random presentation orders
were obtained. Eight such sessions were termed a
group and yielded a P2 for each value of Δt based
on 96 presentations. A total of 25 Δt values were
investigated for each O over the course of 40
experimental sessions.
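
The bookkeeping behind these counts can be checked with a short sketch. The constants are the figures given in the procedure; the `build_session` helper, the seed, and the particular set of five shock-to-flash intervals grouped into one session are hypothetical illustrations, not the randomization actually used.

```python
import random

# Figures stated in the procedure.
VALUES_PER_SESSION = 5          # shock-to-flash intervals per session
SHOCK_PAIRS_PER_VALUE = 12      # shock flash-pairs per interval per session
CONTROL_PAIRS_PER_SESSION = 100 # nonshock flash-pairs per session
SESSIONS_PER_GROUP = 8
TOTAL_DT_VALUES = 25

shock_pairs_per_session = VALUES_PER_SESSION * SHOCK_PAIRS_PER_VALUE
presentations_per_value = SHOCK_PAIRS_PER_VALUE * SESSIONS_PER_GROUP
groups = TOTAL_DT_VALUES // VALUES_PER_SESSION
total_sessions = groups * SESSIONS_PER_GROUP

def build_session(dt_values_ms, seed=None):
    """Return one randomized session: shock trials for each interval
    mixed with the nonshock control trials (hypothetical helper)."""
    trials = [("shock", dt) for dt in dt_values_ms
              for _ in range(SHOCK_PAIRS_PER_VALUE)]
    trials += [("nonshock", None)] * CONTROL_PAIRS_PER_SESSION
    random.Random(seed).shuffle(trials)
    return trials

# Illustrative grouping of intervals (ms); the actual grouping is not given.
example_dt_ms = [-200, -100, -25, 0, 60]
session = build_session(example_dt_ms, seed=1)

print(shock_pairs_per_session, presentations_per_value, total_sessions, len(session))
```

The derived counts agree with the text: 60 shock presentations per session, 96 presentations per interval within a group, and 40 sessions in all.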

The Presession control P2, the Intrasession
control P2, and their respective standard deviations
were computed for each O based on the data
from the 40 sessions. In addition, the Intrasession
control P2 was systematically analyzed over the
course of the experimental session for each O.
This was achieved by counting off five blocks of 20
successive Intrasession control standard flash-pairs
over the course of the session. An Intrasession
P2 and standard deviation were determined
for each successive block based on the 40 sessions.
For the shock presentations, a P2 and standard
deviation were computed for each given Δt value
from the eight sessions in each group.

Based on the 40 sessions, mean Presession,
Midsession, and Postsession electrocutaneous
thresholds and standard deviations were also
computed for each O.


Results 


As shown in Figure 3, P2 initially decreased
from the Intrasession control level
(0.775 for SN, 0.842 for RF) as Δt was
varied from −200 to −25 milliseconds. At
−25 milliseconds, P2 achieved a minimum
value of 0.114 for SN and 0.146 for RF.
After reaching this initial minimum, P2 increased
and reached a maximum value of
0.573 for SN and 0.583 for RF at t0. A
second decrease in P2 then followed as Δt
was varied from t0 to +60 milliseconds. At
+60 milliseconds, a second minimum P2 was
observed and the P2 values obtained were
0.167 for SN and 0.135 for RF. After reaching
this second minimum, P2 increased and
returned to the Intrasession control level at
+240 milliseconds for both Os. Hence the ...
was a marked reduction in the
probability of discriminating two successive
flashes of light when the electrocutaneous
stimulus was presented 25 milliseconds before
either of the flashes. It should be noted
that in the vicinity of these minima, Δt was
resolved in 5-millisecond steps.

Fig. 3. Experiment A. The probability of discriminating two successive flashes of light (f1 and f2)
as a function of the time between electrocutaneous and visual stimulation. Vertical bars indicate ±1
standard deviation.

Control Data. The first of the controls
allowed a comparison between the Presession
and Intrasession P2 levels. The Presession
control P2 was 0.792 for SN and 0.829 for
RF, and the Intrasession control P2 level
was 0.775 for SN and 0.842 for RF. A t test
between the control measures revealed no
significant differences for either O (p > .05).
The second of the control measures monitored
the Intrasession control P2 level for
successive blocks of 20 presentations occurring
over the course of the experimental
session.

Fig. 4. Experiment A. The probability of discriminating
two successive flashes of light before
and during the experimental session without accessory
electrocutaneous stimulation. Vertical
bars indicate ±1 standard deviation. (Panel labels: data from thirty
consecutive flash-pairs presented before experimental session; data from
randomized flash-pairs over course of experimental session.)

As seen in Figure 4, no systematic changes
occurred over the course of the session.
P2 ranged from 0.764 to 0.792 for SN, and
from 0.838 to 0.856 for RF. The standard 
deviations ranged from 0.063 to 0.096 for 
SN, and from 0.057 to 0.076 for RF. The 
Presession control data are shown at the 
left of Figure 4 for reference. 

The third control incorporated in the de- 
sign consisted of measuring O's absolute 
electrocutaneous threshold before, in the 
middle of, and after the experimental session.

These mean Presession, Midsession, and
Postsession electrocutaneous control values 
were 0.63, 0.63, and 0.63 milliampere for 
SN, and 0.57, 0.57, and 0.58 milliampere for 
RF, respectively. The respective standard 
deviations for these values were 0.05, 0.05, 
and 0.06 milliampere for SN and 0.04, 0.04, 
and 0.04 milliampere for RF. An analysis of 
variance between the control measures re- 
vealed no significant differences for either O 
(p > .05). 

As previously indicated, absolute electrocutaneous
thresholds were determined at
the start of the investigation. These values,
determined in a single session, were 0.85
milliampere for SN and 0.60 milliampere for
RF. However, the final Presession electrocutaneous
thresholds were based on the
entire 40 sessions conducted over a period of
several months. These values, as indicated
above, are lower than those previously determined
in the single session. The electrocutaneous
thresholds revealed a marked
decrease over the first eight sessions, especially
for SN. Thus, instead of the ideal twice-threshold
value of stimulating current used
for both subjects, the actual values turned
out to be 2.7 and 2.1 times the absolute
electrocutaneous threshold intensity for SN
and RF, respectively.
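
The multiples quoted above follow directly from the stimulus currents and threshold values reported earlier. The sketch below simply redoes that arithmetic; the dictionary names are illustrative.

```python
# Values reported in the text, in milliamperes.
standard_current_ma = {"SN": 1.70, "RF": 1.20}            # stimulus intensity used
initial_threshold_ma = {"SN": 0.85, "RF": 0.60}           # single-session threshold
mean_presession_threshold_ma = {"SN": 0.63, "RF": 0.57}   # mean over 40 sessions

# Intended multiple, relative to the single-session thresholds.
intended_ratio = {o: standard_current_ma[o] / initial_threshold_ma[o]
                  for o in standard_current_ma}

# Actual multiple, relative to the mean Presession thresholds.
actual_ratio = {o: standard_current_ma[o] / mean_presession_threshold_ma[o]
                for o in standard_current_ma}

for o in standard_current_ma:
    print(o, round(intended_ratio[o], 1), round(actual_ratio[o], 1))
```

Rounded to one decimal place, the actual multiples are 2.7 for SN and 2.1 for RF, against the intended value of 2.0.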


EXPERIMENT B. EFFECT OF ELECTROCUTANEOUS
STIMULATION ON THE VISUAL
THRESHOLD FOR A SINGLE FLASH
OF LIGHT


Method 


Stimulus-Response Conditions. The time between
electrocutaneous and visual stimulation
(Δt) was again the independent variable. A single
flash of light was used in place of the standard
flash-pair, and the threshold luminance for the
detection of this single flash was obtained as the
dependent variable. The single flash used was
identical with either of the flashes in the standard
flash-pair with the exception that its luminance
was varied in discrete steps of approximately
0.03 log millilambert by means of the variable
density filter. Optical System A was used for stimulus
presentation.

Electrocutaneous stimulus intensities and duration
were identical with those used in Experiment
A. Each O was given a dark-adaptation period of
8 minutes before each session. A period of from 10
to 15 seconds was allowed between each stimulus
presentation. A “self-stimulation” technique
was used in this study in which O, after a signal
from the experimenter, presented himself with the
stimuli by depressing a button. This technique
was used to reduce response variability due to
improper fixation occurring just before stimulation.
After each stimulus presentation, O indicated
by tapping whether or not he saw the flash.

Experimental Design. A modified method of 
limits design was used which utilized only ascend- 
ing luminance series. This was done to lessen any 
cumulative light adaptation occurring as a result 
of suprathreshold stimulation. Two types of 
ascending series were used: A control series which 
determined absolute threshold for the flash alone, 
and an experimental series which determined the 
absolute threshold for the flash accompanied by 
the electrocutaneous stimulus at some constant
value of Δt. Presession and Intrasession control
measures were obtained. A Presession control
threshold for the flash alone was determined by
four consecutive ascending series presented prior
to the experimental session proper. Each series
was started from a different subthreshold luminance
which was then increased in discrete steps
until suprathreshold. Threshold was defined as the
midpoint of the luminance interval occurring before
two successive detections. Four specific Δt
values were investigated in each experimental
session, three ascending series being used for a
given value. An Intrasession control threshold
was obtained by the inclusion of six ascending
series using the flash alone.

As a result of preliminary experimentation, it
was found that randomizing the presentation order
of the Δt series among a fixed presentation order of
Intrasession control series yielded minimum
variability. In addition, this technique yielded a
minimal difference between Presession and Intrasession
control thresholds. If the Presession control
series is designated as A, the Intrasession
control series as B, and the Δt series as C (subscripts
1, 2, 3, and 4 indicating four different Δt
values investigated in a given session), then the
presentation order of a typical experiment may be
illustrated as follows:

A A A A   C B C   C B C   C B C   C B C   C B C   C B C   [subscripts on C illegible]

A group of six sessions for each set of four Δt
values was obtained for each O. The presentation
order of the Δt ascending series was randomized
in each session. Five groups of six sessions were
obtained, yielding a total of 20 Δt data points per O.

In order to ascertain if systematic shifts in
visual threshold were occurring over the course
of the experimental session, the data were analyzed
as follows: A mean threshold luminance
value and standard deviation were computed for
each of the four Presession control series and each
of the Intrasession control series based on the 30
sessions for each O. Since these control series
occurred at approximately the same time during
the experimental session, they provided a useful
indicator of possible threshold shifts.

Fig. 5. Experiment B. Threshold luminance required for the detection of a single flash of light
as a function of the time between electrocutaneous and visual stimulation. Vertical bars indicate ±1
standard deviation.

As previously mentioned, three threshold determinations
for a given value of Δt were obtained
in any single session. These three threshold determinations
were then averaged to provide a single
threshold value for the given Δt value in the session.
An overall mean threshold and standard deviation
for the given Δt value based on six such
sessions were then computed and used as the final
threshold data.
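
The threshold rule and the averaging steps described above can be sketched in code. The following Python sketch is illustrative only. The function names and the example series are hypothetical, and two readings are assumptions rather than details given in the text: "the luminance interval occurring before two successive detections" is taken to mean the luminance step immediately preceding the first of the two detections, and the standard deviation is taken to be the sample (n - 1) form.

```python
from statistics import mean, stdev


def series_threshold(luminances, detected):
    """Threshold (log mL) for one ascending series.

    Assumed reading of the rule: locate the first pair of successive
    detections and return the midpoint of the luminance step that
    immediately precedes the first detection of that pair.
    """
    for i in range(len(luminances) - 1):
        if detected[i] and detected[i + 1]:
            if i == 0:
                return None  # series did not start below threshold
            return (luminances[i - 1] + luminances[i]) / 2
    return None  # series ended before two successive detections


def session_threshold(series_thresholds):
    """Average the three determinations obtained in one session."""
    return mean(series_thresholds)


def final_threshold(session_values):
    """Overall mean and sample standard deviation across sessions."""
    return mean(session_values), stdev(session_values)


# Hypothetical ascending series in steps of 0.03 log mL.
steps = [-1.20, -1.17, -1.14, -1.11, -1.08]
print(series_threshold(steps, [False, False, True, True, True]))
```

A single isolated detection does not end the series under this reading; only two detections in a row do, which is one way of discounting an occasional false report.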


Results 


As shown in Figure 5, threshold luminance
decreased from the control level (-0.9747
and -0.9914 log millilambert for SN and
RF, respectively) as Δt was varied from
-190 to -25 milliseconds. A minimal
threshold value was observed at -25 milliseconds
for both Os. These minimal threshold
values were -1.0862 and -1.1135 log
millilambert for SN and RF, respectively.
Threshold luminance then increased and
returned to the control level as Δt was varied
from -25 to approximately +150 milliseconds.
The main finding, therefore, was
the occurrence of a maximal decrease of
approximately 0.10 log millilambert in absolute
threshold luminance for each O when
the electrocutaneous stimulus preceded the
test flash by 25 milliseconds.

Fig. 6. Experiment B. Threshold luminance
required for the detection of a single flash of light
before and during the experimental session determined
from control series not containing electrocutaneous
stimulation. Vertical bars indicate
±1 standard deviation. [Ordinate: threshold luminance (mL).
Panels: data from four control series consecutively
presented before the experimental session; data from six
interpolated control series presented during the session.]

Control Data. Figure 6 shows the mean
threshold luminance and standard deviations
for the four ascending Presession control
series and the six ascending Intrasession
control series. The mean threshold luminances
(in millilamberts) for the Presession
series ranged from 0.105 to 0.107 for SN and
from 0.100 to 0.101 for RF. The standard
deviations of these control values ranged
from 0.009 to 0.010 for SN and from 0.010
to 0.012 for RF. The mean threshold luminances
for the Intrasession series ranged
from 0.105 to 0.107 for SN and from 0.100
to 0.103 for RF. The standard deviations of
these values ranged from 0.008 to 0.011 for
SN and from 0.010 to 0.013 for RF.
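
The Results give thresholds in log millilambert while the control data are given in millilamberts, so a short conversion helps relate the two. The sketch below is illustrative; the function and variable names are hypothetical, and the numerical values are the ones reported in the Results for Experiment B.

```python
def log_ml_to_ml(log_value):
    """Convert a luminance in log millilambert to millilambert."""
    return 10.0 ** log_value


# Values reported in the Results (log millilambert).
control = {"SN": -0.9747, "RF": -0.9914}
minimum = {"SN": -1.0862, "RF": -1.1135}

for observer in control:
    # Size of the threshold decrease in log units.
    drop = control[observer] - minimum[observer]
    print(
        observer,
        round(log_ml_to_ml(control[observer]), 3),
        round(log_ml_to_ml(minimum[observer]), 3),
        round(drop, 2),
    )
```

The control levels convert to about 0.106 mL for SN and 0.102 mL for RF, which fall inside the control ranges quoted above, and the decreases of roughly 0.11 and 0.12 log unit agree with the "approximately 0.10 log millilambert" stated in the text.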

As seen in Figure 6, no consistent change 
in visual threshold occurred before or during
the course of the experimental session. An 
analysis of variance of the four Presession 
control means and the six Intrasession con- 
trol means yielded no significant differences 
for either O (p > .05). 
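
The text does not say which analysis-of-variance design was used for the control means. As a hedged illustration only, the sketch below computes the F ratio for a simple one-way layout, treating each control series as one group of per-session thresholds. The function name and the example data are hypothetical and are not taken from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way layout.

    Each element of `groups` is a list of observations for one
    condition (here, hypothetically, one control series).
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical example with three small groups.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

The resulting F would then be compared with the critical value for the given degrees of freedom at the .05 level. The function assumes some within-group variation; identical observations within every group would make the denominator zero.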


EXPERIMENT C. THE PROBABILITY OF DISCRIMINATING
TWO SUCCESSIVE FLASHES
OF LIGHT AS A FUNCTION OF INCREASING
THE LUMINANCE OF
THE FIRST FLASH IN THE STANDARD
FLASH-PAIR


Method 


Stimulus-Response Conditions. The present experiment
was concerned with varying the luminance
of the first flash in the standard flash-pair
without accessory electrocutaneous stimulation.
P₂ was acquired as a function of increasing the
luminance of the first flash over a range of 0.39
log millilambert above its usual value of -0.20
log millilambert. The discrete luminance values
used to cover this range were -0.20, -0.10, -0.02,
+0.11, and +0.19 log millilambert. The second
flash and the dark interval remained at their usual
values of -0.20 log millilambert and 85 milliseconds,
respectively. Optical System B was used
for visual stimulation. Five minutes of dark
adaptation were given before the experimental
session. A period of from 10 to 15 seconds was
allowed between presentation of each flash-pair.
Experimental Design. The method of constant
stimuli was used. An experimental session consisted
of the initial determination of a Presession
control P₂ by the consecutive presentation of 20
standard flash-pairs. This was followed by the
presentation of 100 flash-pairs randomly containing
the different values of luminance of the first
flash. Five different luminance values were used,
there being 20 presentations of each value. One of
the luminance values was the luminance normally used in
the standard flash-pair (-0.20 log millilambert).
This specific flash-pair yielded, therefore, an
Intrasession control P₂ value which could then be
compared with the other control measures used.
[...] A total of five sessions using
the same luminance values was obtained, resulting
in a P₂ data point for each luminance level based
upon a total of 100 presentations.

Presession, Intrasession, and Postsession control
P₂ values were determined for each session
and the means and standard deviations determined
for the five sessions.
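
The bookkeeping described above (20 presentations per luminance value per session, five sessions, hence 100 presentations per data point) can be sketched as follows. The counts are hypothetical, the function names are illustrative, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was computed.

```python
from statistics import mean, stdev


def p2_per_session(two_flash_reports, presentations=20):
    """Proportion of 'two flashes' reports in each session."""
    return [count / presentations for count in two_flash_reports]


def summarize(p2_values):
    """Mean and sample standard deviation of P2 across sessions."""
    return mean(p2_values), stdev(p2_values)


# Hypothetical counts of 'two flashes' reports out of 20
# presentations of one luminance value in each of five sessions.
counts = [17, 16, 18, 17, 16]

session_p2 = p2_per_session(counts)
mean_p2, sd_p2 = summarize(session_p2)

# With equal numbers of presentations per session, the mean of the
# session proportions equals the proportion pooled over all 100.
pooled_p2 = sum(counts) / (20 * len(counts))
print(round(mean_p2, 3), round(sd_p2, 3), round(pooled_p2, 3))
```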


Results


As shown in Figure 7, increasing the luminance
of the first flash in the standard
flash-pair resulted in a systematic decrease
in P₂ for both Os. P₂ decreased from 0.840
to 0.110 for SN and from 0.850 to 0.140 for
RF with the addition of 0.39 log millilambert
to the first flash.

Fig. 7. Experiment C. The probability of discriminating
two successive flashes of light as a
function of increasing the luminance of the first
flash in the standard flash-pair. [Abscissa: luminance
of f₁ (log mL), -0.20 to +0.20.]

Control Data. The Presession, Intrasession,
and Postsession P₂ control values for each
O were 0.820, 0.840, and 0.810 for SN, and
0.790, 0.850, and 0.790 for RF, respectively.
The respective standard deviations of these
values were 0.051, 0.092, and 0.037 for SN,
and [...] 0.037 for RF. An
analysis of variance between the control
measures revealed no significant differences
for either O (p > .05).


EXPERIMENT D. THE PROBABILITY OF DISCRIMINATING
TWO SUCCESSIVE FLASHES
OF LIGHT AS A FUNCTION OF DARK
INTERVAL FOR A GIVEN CONDITION
OF ELECTROCUTANEOUS
STIMULATION


Method 


Stimulus-Response Conditions. In the present
experiment, P₂ was obtained as a function of
increasing the dark interval in the standard flash-pair
when accompanied by the electrocutaneous
stimulus at a Δt value of -25 milliseconds.
Electrocutaneous intensities and duration
were identical with those used in Experiment A.
It was found in Experiment A that an electrocutaneous
stimulus accompanying the standard
flash-pair at a Δt [...]
used to produce the standard flash-pair.
That is, each of the flashes was generated by two
[...] superimposed by a [...]
more exact comparison
with the next experiment in which a differential
luminance flash-pair was used. Each O was
given 5 minutes of dark adaptation prior to the
experimental session. A period of from 10 to 15
seconds was allowed between successive stimulus
presentations.
Experimental Design. A constant stimulus
method similar to that used in Experiment C was
used. A Presession control P₂ was initially determined
by the consecutive presentation of 20
standard flash-pairs accompanied by electrocutaneous
stimulation maintained at a constant
Δt of -25 milliseconds. This was followed by the
presentation of 100 flash-pairs accompanied by
electrocutaneous stimulation at a constant Δt
value of -25 milliseconds but having various
values of dark interval. In each session, all of
the five values of dark interval indicated above
were used, there being 20 random presentations
of each value. One of the dark intervals used (85
milliseconds) was the same as that used in the
Presession control presentations. This allowed the
determination of an Intrasession control P₂
which could be compared with the other control
values. A Postsession control P₂ was then determined
by the same method used to determine
the Presession control P₂. A total of five sessions
for each O was obtained. This resulted in a dark
interval data point based upon 100 presentations.

Presession, Intrasession, and Postsession control
P₂s were computed for each session, and
means and standard deviations were determined
based upon five sessions.

A P₂ value was determined for each dark interval
in a given session and a mean P₂ and standard
deviation computed based upon the five sessions.


Results 


Figure 8 depicts the increase in P₂ which
occurred as a function of increasing the dark
interval in the standard flash-pair accompanied
by electrocutaneous stimulation at
-25 milliseconds. As the dark interval was
increased from 85 to 105 milliseconds, P₂
increased from 0.110 to 0.850 for SN and
from 0.140 to 0.860 for RF.

Control Data. The Presession, Intrasession,
and Postsession P₂ control values
were 0.170, 0.110, and 0.150 for SN and
0.180, 0.140, and 0.120 for RF, respectively.
The respective standard deviations of these
values were 0.051, 0.037, and 0.055 for SN
and 0.060, 0.097, and 0.051 for RF. An
analysis of variance between the control
measures revealed no significant differences
for either O (p > .05).


EXPERIMENT E. THE PROBABILITY OF DISCRIMINATING
TWO SUCCESSIVE FLASHES
OF LIGHT AS A FUNCTION OF DARK
INTERVAL FOR A GIVEN CONDITION
OF DIFFERENTIAL FLASH
LUMINANCE


Method 


Stimulus-Response Conditions. In Experiment
C, it was demonstrated that P₂ decreased as a
function of increasing the luminance of the first
flash in the standard flash-pair. From this function,
it was determined that a luminance increase
of 0.39 log millilambert added to the first flash was
sufficient to lower P₂ from 0.840 to 0.110 for SN
and from 0.850 to 0.140 for RF. Therefore, this is
the luminance increment necessary to produce
approximately the same decrement in P₂ as did
the electrocutaneous stimulus when accompanying
the standard flash-pair at a Δt value of -25
milliseconds in Experiment A. In the present
experiment, the standard flash-pair is presented
with this 0.39 log millilambert increment added
to the first flash. The purpose of this experiment
was to determine the increase in the dark interval
necessary to return P₂ to the approximate control
level of 0.800 under this condition of differential
luminance. The actual values of dark interval used
were 85, 95, 105, 120, and 130 milliseconds. Optical
System B was used for the presentation of the
flash-pairs. Each O was given 5 minutes of dark
adaptation before each experimental session. A
period of from 10 to 15 seconds was allowed between
each stimulus presentation.

Experimental Design. A constant stimulus
method identical with that in Experiment D was
used [...]
presentations for each of the five values of dark
interval indicated above. Data from five sessions
were obtained for each O. As indicated, one of the
values of dark interval (85 milliseconds) was the
same as that used in the Presession control presentations.
This allowed the determination of an
Intrasession control value. A Postsession control
P₂ was determined by the same method used to
determine the Presession control.
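
The randomized block of 100 presentations used in the constant stimulus design can be sketched as below. The dark interval values are those listed for this experiment; the function names, the use of a seeded pseudorandom shuffle, and the way the Intrasession control trials are picked out are illustrative assumptions, not a description of the original apparatus or procedure.

```python
import random
from collections import Counter

# Dark interval values (milliseconds) listed for Experiment E.
DARK_INTERVALS_MS = [85, 95, 105, 120, 130]


def build_schedule(values, repeats=20, seed=None):
    """Return a randomly ordered list with `repeats` copies of each value."""
    rng = random.Random(seed)
    schedule = [v for v in values for _ in range(repeats)]
    rng.shuffle(schedule)
    return schedule


def control_trial_indices(schedule, control_value=85):
    """Positions of the trials that use the standard dark interval.

    These trials provide the Intrasession control measure because
    they match the Presession control presentations.
    """
    return [i for i, v in enumerate(schedule) if v == control_value]


schedule = build_schedule(DARK_INTERVALS_MS, repeats=20, seed=1)
print(len(schedule), Counter(schedule)[85], len(control_trial_indices(schedule)))
```

Shuffling a fully built list, rather than drawing each trial independently, guarantees exactly 20 presentations of every value in each session.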

Presession, Intrasession, and Postsession control
P₂s were computed for each session and means
and standard deviations determined for the five
sessions. The P₂ for each dark interval value was
computed for each session and the mean and standard
deviation determined for the five sessions.


Results 


The data indicate that P₂ increased from
0.120 to 0.840 for SN and from 0.130 to
0.860 for RF as the dark interval was
lengthened from 85 to 130 milliseconds.
These results are depicted in Figure 8 together
with the results of Experiment D for
comparison. It is seen that for both Os, the
dark interval necessary to return P₂ to the
control level of 0.800 was approximately 25
milliseconds longer for the differential luminance
flash-pair than for the standard
flash-pair preceded by the electrocutaneous
stimulus at a Δt value of -25 milliseconds.

Control Data. The Presession, Intrasession,
and Postsession threshold control
values were 0.180, 0.120, and 0.130 for SN and
0.160, 0.130, and 0.150 for RF, respectively.
The respective standard deviations of these
values were 0.075, 0.051, and 0.068 for SN,
and 0.037, 0.093, and 0.032 for RF. An
analysis of variance of the control measures
revealed no significant differences for either
O (p > .05).


EXPERIMENT F. THE PROBABILITY OF DISCRIMINATING
TWO SUCCESSIVE FLASHES
OF LIGHT AS A FUNCTION OF ELECTROCUTANEOUS
STIMULUS
INTENSITY


Method 


Stimulus-Response Conditions. In the present
experiment, as in Experiment A, the standard
flash-pair was accompanied by electrocutaneous
stimulation. However, rather than use a lengthy
procedure which would result in a family of
P₂ = f(Δt) curves with electrocutaneous intensity
as a parameter, a more expedient method was
used. P₂ was investigated as a function of electrocutaneous
intensity for three given values of
Δt (t₀, -25, and -50 milliseconds).

Optical System A was used for the presentation 
of the visual stimuli. Each O was given 5 minutes 
of dark adaptation before the experimental ses- 
sion. A period of from 10 to 15 seconds was allowed 
between stimulus presentations. 

Experimental Design. The experimental session
again began with an initial determination of the
Presession control P₂ by the consecutive presentation
of 20 standard flash-pairs. This was followed
by the presentation of 100 standard flash-pairs
accompanied by different values of electrocutaneous
intensity but with a given Δt value which was
held constant throughout the session. Discrete
subthreshold through just suprathreshold intensity
values differing by approximately 0.04 milliampere
were used. Suprathreshold intensity values were
[...] by 0.50 milliampere. Five
values of electrocutaneous intensity were used in
a single session, there being 20 random presentations
[...] electrocutaneous intensity values were obtained
for each O. This resulted in [...]
on a total of 100 presentations for a given combination
[...] A Postsession control P₂ was determined
at the end of each session by the consecutive
presentation of 20 standard flash-pairs.

Presession and Postsession P₂s were
determined for each session and means and standard
deviations computed. [...]
deviation were obtained for each electrocutaneous
intensity based on the five sessions.


Results 
Figure 9 presents P₂ as a function of sub[...]
intensity for the Δt values of
t₀, -25, and -50 milliseconds. Absolute
electrocutaneous thresholds based on the
Presession control data obtained in Experiment
A are indicated on the abscissas by
vertical arrows. These threshold values were
0.63 milliampere for SN and 0.57 milliampere
for RF. In the group of sessions concerned
with the threshold range of intensity,
the electrocutaneous stimulus was
varied from 0.56 to 0.72 milliampere for SN
and from 0.52 to 0.68 milliampere for RF.
Over this threshold range, P₂ decreased from
the control level (approximately 0.800) to
different minimum values for each Δt at
about the same just suprathreshold electrocutaneous
intensity. Presumably, this decrease
was related to the increasing detectability
of the electrocutaneous stimulus as its
intensity was increased.

Fig. 9. The probability of discriminating two successive flashes of light as a function of electrocutaneous
stimulus intensity for three given values of Δt. Vertical arrow on abscissa indicates average
absolute electrocutaneous threshold intensity.


With the exception of the above-mentioned
decrease, P₂ maintained a different
but constant level for each particular value
of Δt as electrocutaneous intensity was increased
from a “low” to a “high” value.
Hence, excluding threshold considerations,
P₂ was not found to vary systematically as a
function of the suprathreshold electrocutaneous
stimulus intensities used in the
present study.

The P₂ level for each Δt value found for
suprathreshold intensity offers a check on
the data obtained in Experiment A. It
might be expected that the P₂ level observed
in the present experiment for each value of
Δt would be consistent with the P₂ value
obtained for that temporal separation in
Experiment A. A comparison indicates that
the results of both experiments are approximately
consistent; however, the P₂ levels for
the values of -25 and -50 milliseconds
obtained in the present experiment do tend
to be somewhat higher than the comparable
values in Experiment A. This is possibly due
to procedural differences between the two
experiments.

Control Data. The Presession and Postsession
control P₂s were 0.760 and 0.760 for
SN and 0.797 and 0.790 for RF, respectively.
The respective standard deviations of these
values were 0.064 and 0.066 for SN and 0.055
and 0.044 for RF. A t test between the
control measures revealed no significant
differences for either O (p > .05).


DISCUSSION 


After inspection of the functions obtained
in Experiment A (Figure 3), it was hypothesized
that the electrocutaneous stimulus was
affecting each of the flashes in some manner.
This hypothesis led to the prediction that
using a flash-pair with a longer dark interval
might yield a function in which P₂ would
fully return to the control level at some
value of Δt between the two flashes. This
would, in effect, render a distinct P₂ data [...]
for each of the flashes. [...]
not possible to test [...]
flash luminance would have resulted in
bringing the flashes [...]
threshold. It was therefore decided to es[...]
apparent brightness of [...]
However, any experimental design involving a
brightness match [...]
accessory stimulation would involve some
technique of successive comparison. Such a
design suggested too many difficulties and
was deemed unacceptable. A measure of
absolute visual threshold was therefore used.

The results of Experiment B supported
the hypothesis that electrocutaneous stimulation
could affect the detection of a single
flash of light. The fact that the minimum
absolute threshold occurred at a Δt value of
-25 milliseconds brings the findings of both
experiments into direct relationship. Actually,
in the single-flash experiment, it would
not have been deemed unusual if the minimum
visual threshold had occurred at a negative
value of Δt somewhat greater than -25
milliseconds. This would be predicted from
the expected longer latency of retinal discharge
to a threshold flash. Such a shift in
the critical value of Δt may have in fact
occurred but was not detected by the
5-millisecond resolution used in Experiment
B. This reasoning implies, of course, that
the value of -25 milliseconds obtained in
both experiments represents a difference in
conduction time for impulses produced by
the electrocutaneous and visual stimulation
to arrive at some locus (or loci) in the central
nervous system.

The lowering of absolute visual threshold
by electrocutaneous stimulation suggests
that the mechanism responsible for the decrease
in P₂ in Experiment A might be a
systematic alteration in the apparent brightness
of the flashes as the electrocutaneous
stimulus was moved from negative through
[...] hypothesis,
the electrocutaneous stimulus produced a
maximal increase in the apparent brightness
of the first flash at a Δt value of -25 milliseconds.
The increased brightness of the first
flash might then “mask” the second flash
and result in the observed decrease in P₂.
As the electrocutaneous stimulus was moved
into positive time, its effect on the first flash
diminished and P₂ increased. However, before
P₂ could fully return to the control level
of 0.800, the electrocutaneous stimulus exerted
an increasing effect on the second flash
in the standard flash-pair. The corresponding
brightness increase of the second flash then
produced a second decrease in P₂.

Clearly, one test of the above brightness
enhancement hypothesis involves the study
of P₂ as a function of the independent
manipulation of the luminance of each of
the flashes without the use of accessory
electrocutaneous stimulation. As shown in
Figure 7, increasing the luminance of the
first flash by 0.39 log millilambert does in
fact result in a decrease in P₂ from approximately
0.800 to 0.130 for both Os. However,
it was observed that a comparable increase
of 0.39 log millilambert in the luminance of
the second flash in the standard flash-pair
produced no appreciable change in P₂. It
would therefore seem that the failure of P₂
to decrease when the luminance of the second
flash was increased is inconsistent with the
brightness enhancement hypothesis since the
effect of the electrocutaneous stimulus was
to produce an equal decrease in P₂ when it
occurred 25 milliseconds prior to either flash.

A further observation which reduced the
applicability of the brightness enhancement
hypothesis is the following: the brightness of
the standard flash-pair accompanied by an
electrocutaneous stimulus temporally set for
maximal effect before the first flash was always
judged as “dimmer” than the hypothetically
comparable situation in which the
standard flash-pair was viewed without accessory
stimulation but with the 0.39-log-millilambert
increment in the luminance of
the first flash.

To resolve this question objectively,
Experiments D and E were conducted with similar
psychophysical techniques, and both used
suprathreshold double flashes. In Experiment
D, the standard flash-pair was accompanied
by an electrocutaneous stimulus prior
to the first flash at the Δt value of -25
milliseconds which had produced the maximum
lowering of P₂. In Experiment E,
there was no electrocutaneous stimulus, but
there was an increment of 0.39 log millilambert
in the luminance of the first flash.
Thus the starting point of both Experiments
D and E was a similar value of P₂ (approximately
0.130). The independent variable
for both experiments was the increase in
the dark interval necessary to return P₂ to
the control level of 0.800. According to
the brightness-enhancement hypothesis, the
standard flash-pair with the 0.39-log-millilambert
increment in the luminance of the
first flash and the standard flash-pair preceded
by electrocutaneous stimulation at
-25 milliseconds are comparable events
since they produced the same minimum P₂.
By determining the increase in the dark
interval necessary to return P₂ to the control
level of 0.800 for both of these conditions, it
becomes possible to objectively compare
them. It might be expected that if the
electrocutaneous stimulus was lowering P₂
by producing an increase in the luminance
of the first flash comparable to actually
adding 0.39 log millilambert, then the
lengthening of the dark interval should
yield similar curves for both conditions. As
seen in Figure 8, the data for the two experiments
yielded distinctly different functions.
Hence, the data do not support the
hypothesis of brightness enhancement as the
mechanism underlying the effect of the
electrocutaneous stimulus on double-flash
resolution.

The rejection of the brightness-enhance- 
ment hypothesis posed the task of examining 
alternative hypotheses concerning mecha- 
nism. Foremost among these alternatives is 
the temporal specification or “warning sig- 
nal” hypothesis. Howarth and Treisman’s 
(1958) finding that the temporal specifica- 
tion of an auditory intensity increment by a 
visual warning signal can lower auditory 
threshold is directly relevant to the inter- 
pretation of Experiment B, in which the 
electrocutaneous stimulus resulted in the 
lowering of threshold for a single flash. The 
assumption that the electrocutaneous stimu- 
lus temporally specified the occurrence of 
the threshold test flash and resulted in an 
increased detectability seems a reasonable 
possibility. 

The possible relevance of the warning
signal hypothesis is given additional support
by the results obtained in Experiment F in
which increasing the intensity of the electrocutaneous
stimulus did not alter P₂ for the
double flash. Halliday and Mingay (1961)
state that if the sole function of a warning
signal or “marker” is to temporally specify
the occurrence of a test stimulus, then increasing
the intensity of the marker should
not facilitate its function in increasing the
detectability of the test stimulus.

This view might be qualified when it is
considered that neural conduction time
might vary as a function of warning signal
intensity. Given an optimal warning time,
it seems plausible to expect that altering
the intensity of the electrocutaneous stimulus
should produce some change in detectability
due to conduction time changes. As
indicated, however, no systematic changes
in P₂ as a function of suprathreshold intensity
were found at any of the three Δt values
tested (t₀, -25, and -50 milliseconds).
However, such conduction-time changes are
probably small and any resulting change in
P₂ might well be obscured by the variability
in the data. Towe and Amassian (1958),
recording from single units on the somatosensory
cortex of anaesthetized monkeys,
varied an electrocutaneous digital stimulus
from “low” to “high” intensity and observed
a maximum conduction-time change
of only approximately 2 milliseconds.
Although the lowering of absolute threshold
in Experiment B and the invariance of P₂
as a function of electrocutaneous intensity
[...] “detection”
hypothesis, certain difficulties exist. First,
there is nothing in the detection hypothesis
which would indicate that 25 milliseconds
is an optimal warning period. Secondly,
Howarth and Treisman report a lowering of
auditory threshold only when using a method
of limits design involving the repetition of a
given warning interval. They failed, however,
to find similar results when employing randomized
warning intervals in a constant
stimulus design. [...]
subject having prior [...]
temporal interval between a warning signal
and a test stimulus. This is in contradiction
to the results of Experiment [...]
randomized intervals were used. Therefore,
as with the brightness-enhancement hypothesis,
the data of the present study are
only partially consistent with a detection
hypothesis.

The quest for an adequate model to encompass
the present findings is not much
furthered by turning to the neurophysiological
substrate of “alerting.” While there
exists an abundance of neurophysiological
data (Jasper, 1958; Magoun, 1958) indicating
that the “attention” or stimulus
receptivity of the organism may be altered
by so-called “alerting” or cortical activation,
none of the available data suggest that
there is a critical interval of 25 milliseconds
for the enhancement of stimulus receptivity.
The available data on electroencephalic
desynchronization are in a completely different
time domain with respect to either
latency or time course of desynchronization.
However, this does not rule out the presence
of more subtle alterations of excitability
than gross electroencephalic desynchronization,
and the data of the present study
would suggest that explorations in this
direction might prove fruitful.

In conclusion, it may be stated that the 
present study has yielded data which are 
inconsistent with any of the available con- 
ceptual frameworks. It is to be hoped that 
this will contribute toward the rethinking of 
assumptions and models in the fields of 
sensory interaction and detection theory. 


SUMMARY


In Experiment A, two flashes separated
by an 85-millisecond dark interval were
presented successively to the same locus on
the fovea of O's dark-adapted right eye.
Each flash was 1 millisecond in duration,
-0.20 log millilambert in luminance, and
subtended 0.80° of visual angle. Under these
conditions, the flashes were discriminated as
two events (P₂) approximately 80% of the
time. A 1-millisecond electrocutaneous stimulus
(DC square-wave pulse) was delivered
to a digit on O's ipsilateral hand in varying
temporal relationship (Δt) to the flashes. A
minimum P₂ of approximately 13% was
observed when the electrocutaneous stimulus
was presented 25 milliseconds before the
onset of the first or second flash.

In order to obtain information about the
mechanism underlying this phenomenon,
the effect of electrocutaneous stimulation
on the absolute visual threshold for a single
1-millisecond flash was then investigated. A
maximum decrease of approximately 0.10
log millilambert in absolute threshold was
observed for both Os when the electrocutaneous
stimulus preceded the onset of the
flash by 25 milliseconds (Experiment B).
Based on this observation, it was hypothesized
that the reduction in P₂ in Experiment
A resulted from a brightness enhancement
of either of the two flashes depending on the
relative temporal position of the electrocutaneous
stimulus. To test whether a
brightness increase would actually result in
a lowering of P₂, the luminance of the first
flash in the flash-pair was increased (Experiment
C). It was found that an increase of
0.39 log millilambert in the luminance of
the first flash lowered P₂ over the approximate
range observed for both Os in Experiment
A.
Although P₂ decreased as a function of
increasing the luminance of the first flash
in the flash-pair, certain discrepancies indicated
the need for further experimentation
before accepting or rejecting the “brightness
enhancement hypothesis.” It was observed
that the brightness of the flash-pair accompanied
by an electrocutaneous stimulus set
for maximal effect before the first flash
appeared markedly dimmer than the hypothetically
comparable situation in which the
standard flash-pair was viewed without
electrocutaneous stimulation but with the
0.39-log-millilambert luminance increment
added to the first flash.

Experiments D and E were performed in
order to resolve the issue. In Experiment D,
the flash-pair was accompanied by an electrocutaneous
stimulus set before the first
flash to produce the minimum P₂. In Experiment
E, the flash-pair was used without
electrocutaneous stimulation but with a
0.39-log-millilambert increment in the
luminance of the first flash to produce the
same minimum P₂. In both experiments,
the dark interval was systematically increased,
and the corresponding increase in
P₂ was measured. Two distinctly different
P₂ functions were obtained as a result of
increasing the dark interval in the two cases.
Based on these observations, the brightness
enhancement hypothesis was rendered highly
improbable and rejected. In another series
of experiments, it was shown that P₂ was not
a function of suprathreshold electrocutaneous
intensity. Due to this and related findings,
alternative approaches based on signal
detection and electrophysiological “alerting”
data were considered.